Abstract

We study $n \times n$ symmetric random matrices $H$, possibly discrete, with
iid above-diagonal entries. We show that $H$ is singular with probability at
most $\exp(-n^c)$, and that $\|H^{-1}\| = O(\sqrt{n})$. Furthermore, the
spectrum of $H$ is delocalized on the optimal scale $o(n^{-1/2})$. These
results improve upon a polynomial singularity bound due to Costello, Tao and
Vu, and they generalize, up to constant factors, results of Tao and Vu, and
Erdős, Schlein and Yau.

1 Introduction

1.1 Invertibility problem 

This work is motivated by the invertibility problem for $n \times n$ random
matrices $H$. This problem consists of two questions:

1. What is the singularity probability $\mathbb{P}\{H \text{ is singular}\}$?

2. What is the typical value of the spectral norm of the inverse, $\|H^{-1}\|$?

A motivating example is for random Bernoulli matrices $B$ whose entries are
±1 valued symmetric random variables. If all entries are independent, it is
conjectured that the singularity probability of $B$ is $(\frac{1}{2} + o(1))^n$,
while the best current bound $(\frac{1}{\sqrt{2}} + o(1))^n$ is due to Bourgain,
Vu and Wood [2]. The typical norm of the inverse in this case is
$\|B^{-1}\| = O(\sqrt{n})$ [14, 20], see [15]. Moreover, the following
inequality due to Rudelson and the author [13] simultaneously establishes the
exponentially small singularity probability and the correct order for the norm of
the inverse:

\[ \mathbb{P}\Big\{ \min_k s_k(B) \le \varepsilon n^{-1/2} \Big\} \le C\varepsilon + 2e^{-cn}, \tag{1.1} \]

where $C, c > 0$ are absolute constants. Here $s_k(B)$ denote the singular values
of $B$, so the matrix $B$ is singular iff $\min_k s_k(B) = 0$; otherwise
$\min_k s_k(B) = 1/\|B^{-1}\|$.

Less is known about the invertibility problem for symmetric Bernoulli matrices
$H$, where the entries on and above the diagonal are independent ±1 valued
symmetric random variables. As in the previous case of iid entries, it is even
difficult to show that the singularity probability converges to zero as $n \to \infty$.
This was done by Costello, Tao and Vu @] who showed that 

\[ \mathbb{P}\{H \text{ is singular}\} = O(n^{-1/8+\delta}) \tag{1.2} \]

for every $\delta > 0$. They conjectured that the optimal singularity probability
bound for symmetric Bernoulli matrices is again $(\frac{1}{2} + o(1))^n$.

1.2 Main result 

In this paper, we establish a version of (1.1) for symmetric random matrices. To
give a simple specific example, our result will yield both an exponential bound
on the singularity probability and the correct order of the norm of the inverse
for symmetric Bernoulli matrices:

\[ \mathbb{P}\{H \text{ is singular}\} \le 2e^{-n^c}; \qquad \mathbb{P}\{\|H^{-1}\| \le C\sqrt{n}\} \ge 0.99, \]

where $C, c > 0$ are absolute constants.
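For very small $n$ the singularity probability of a symmetric Bernoulli matrix can be computed exactly by enumerating all sign patterns. The following sketch is an illustration only and plays no role in the argument of this paper; the helper names `det_bareiss`, `symmetric_sign_matrices` and `singularity_probability` are hypothetical, and the enumeration is feasible only for tiny dimensions because there are $2^{n(n+1)/2}$ such matrices.

```python
from fractions import Fraction
from itertools import product


def det_bareiss(rows):
    """Exact determinant of an integer matrix by fraction-free (Bareiss) elimination."""
    a = [list(r) for r in rows]
    n = len(a)
    sign, prev = 1, 1
    for k in range(n - 1):
        if a[k][k] == 0:
            # Look for a row below with a nonzero entry in column k and swap it up.
            swap = next((i for i in range(k + 1, n) if a[i][k] != 0), None)
            if swap is None:
                return 0  # the whole column is zero, so the matrix is singular
            a[k], a[swap] = a[swap], a[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # The division by the previous pivot is exact in Bareiss' scheme.
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]


def symmetric_sign_matrices(n):
    """Yield every n x n symmetric matrix with +-1 entries on and above the diagonal."""
    idx = [(i, j) for i in range(n) for j in range(i, n)]
    for signs in product((-1, 1), repeat=len(idx)):
        m = [[0] * n for _ in range(n)]
        for (i, j), s in zip(idx, signs):
            m[i][j] = m[j][i] = s
        yield m


def singularity_probability(n):
    """Exact P{H is singular} for the n x n symmetric Bernoulli ensemble."""
    total = singular = 0
    for m in symmetric_sign_matrices(n):
        total += 1
        singular += det_bareiss(m) == 0
    return Fraction(singular, total)
```

For example, `singularity_probability(2)` evaluates to 1/2, since a matrix with rows $(a, b)$ and $(b, c)$ and ±1 entries has determinant $ac - 1$, which vanishes exactly when $a = c$.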

Our results will apply not just for Bernoulli matrices, but also for general 
matrices H that satisfy the following set of assumptions: 

(H) $H = (h_{ij})$ is a real symmetric matrix. The above-diagonal entries $h_{ij}$,
$i < j$, are independent and identically distributed random variables with zero
mean and unit variance. The diagonal entries $h_{ii}$ can be arbitrary numbers
(either non-random, or random but independent of the off-diagonal entries).

The eigenvalues of $H$ in a non-decreasing order are denoted by $\lambda_k(H)$.

Theorem 1.1 (Main). Let $H$ be an $n \times n$ symmetric random matrix satisfying
(H) and whose off-diagonal entries have finite fourth moment. Let $K > 0$. Then
for every $z \in \mathbb{R}$ and $\varepsilon \ge 0$, one has

\[ \mathbb{P}\Big\{ \min_k |\lambda_k(H) - z| \le \varepsilon n^{-1/2} \ \wedge\ \max_k |\lambda_k(H)| \le K\sqrt{n} \Big\} \le C\varepsilon^{1/9} + 2e^{-n^c}. \tag{1.3} \]

Here $C, c > 0$ depend only on the fourth moment of the entries of $H$ and on $K$.

The bound on the spectral norm $\|H\| = \max_k |\lambda_k(H)|$ can often be removed
from (1.3) at no cost, as one always has $\|H\| = O(\sqrt{n})$ with high probability
under the four moment assumptions of Theorem 1.1; see Theorem 1.5 for a
general result.

Moreover, for some ensembles of random matrices one has $\|H\| = O(\sqrt{n})$ with
exponentially high probability. This holds under the higher moment assumption
that

\[ \mathbb{E} \exp(h_{ij}^2 / M^2) \le e, \quad i \ne j \tag{1.4} \]

for some number $M > 0$. Such random variables $h_{ij}$ are called sub-gaussian
random variables, and the minimal number $M$ is called the sub-gaussian moment
of $h_{ij}$. The class of sub-gaussian random variables contains standard normal,
Bernoulli, and generally all bounded random variables, see [23] for more
information. For matrices with subgaussian entries, it is known that
$\|H\| = O(\sqrt{n})$ with probability at least $1 - 2e^{-n}$, see Lemma 2.3. Thus
Theorem 1.1 implies:

Theorem 1.2 (Subgaussian). Let $H$ be an $n \times n$ symmetric random matrix
satisfying (H), whose off-diagonal entries are subgaussian random variables, and
whose diagonal entries satisfy $|h_{ii}| \le K\sqrt{n}$ for some $K$. Then for every
$z \in \mathbb{R}$ and $\varepsilon \ge 0$, one has

\[ \mathbb{P}\Big\{ \min_k |\lambda_k(H) - z| \le \varepsilon n^{-1/2} \Big\} \le C\varepsilon^{1/9} + 2e^{-n^c}. \tag{1.5} \]

Here $c > 0$ and $C$ depend only on the sub-gaussian moment $M$ and on $K$.

Singularity and invertibility. For $\varepsilon = 0$, Theorem 1.2 yields an exponential
bound on singularity probability:

\[ \mathbb{P}\{H \text{ is singular}\} \le 2e^{-n^c}. \]

Furthermore, since $\min_k |\lambda_k(H) - z| = 1/\|(H - zI)^{-1}\|$, (1.5) can be stated
as a bound on the spectral norm of the resolvent,

\[ \mathbb{P}\Big\{ \|(H - zI)^{-1}\| \ge \sqrt{n}/\varepsilon \Big\} \le C\varepsilon^{1/9} + 2e^{-n^c}. \]

This estimate is valid for all $z \in \mathbb{R}$ and all $\varepsilon \ge 0$. In particular, we have

\[ \|(H - zI)^{-1}\| = O(\sqrt{n}) \quad \text{with high probability.} \tag{1.6} \]

For $z = 0$ this yields the bound on the norm of the inverse, and on the condition
number of $H$:

\[ \|H^{-1}\| = O(\sqrt{n}), \quad \kappa(H) := \|H\| \|H^{-1}\| = O(n) \quad \text{with high probability.} \tag{1.7} \]

In these estimates, the constants implicit in $O(\cdot)$ depend only on $M$, $K$ and the
desired probability level.






Delocalization of eigenvalues. Theorem 1.2 is a statement about delocalization
of eigenvalues of $H$. It states that, for any fixed short interval
$I \subset \mathbb{R}$ of length $|I| = o(n^{-1/2})$, there are no eigenvalues in $I$ with
high probability. This is consistent with the simple heuristics about eigenvalue
spacings. According to the spectral norm bound, all $n$ eigenvalues of $H$ lie in
an interval of length $O(\sqrt{n})$. So the average spacing between the eigenvalues
is of the order $n^{-1/2}$. Theorem 1.2 states that, indeed, any interval of smaller
length $o(n^{-1/2})$ is likely to fall in a gap between consecutive eigenvalues. For
results in the converse direction, on good localization of eigenvalues around
their means, see [23] and the references therein.
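The spacing heuristic can be written out as a short calculation. This is only a sketch: the constant $C$ stands for the implicit constant in the spectral norm bound $\|H\| \le C\sqrt{n}$, and the eigenvalue count assumes, purely for illustration, that the eigenvalues are spread roughly evenly over the spectrum.

```latex
% All n eigenvalues lie in [-C\sqrt{n}, C\sqrt{n}], an interval of length 2C\sqrt{n}.
\text{average spacing} \approx \frac{2C\sqrt{n}}{n} = \frac{2C}{\sqrt{n}},
\qquad
\#\{k : \lambda_k(H) \in I\} \approx \frac{|I|}{2C/\sqrt{n}}
  = \frac{|I|\sqrt{n}}{2C} = o(1)
\quad \text{when } |I| = o(n^{-1/2}).
```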

Related results. A result of the type of Theorem 1.2 was known for random
matrices $H$ whose entries have continuous distributions with certain smoothness
properties, and in the bulk of spectrum, i.e. for $|z| \le (2 - \delta)\sqrt{n}$ (and
assuming that the diagonal entries of $H$ are independent random variables with
zero mean and unit variance). A result of Erdős, Schlein and Yau [5] (stated for
complex Hermitian matrices) is that

\[ \mathbb{P}\Big\{ \min_k |\lambda_k(H) - z| \le \varepsilon n^{-1/2} \Big\} \le C\varepsilon. \tag{1.8} \]

This estimate does not have the singularity probability term $2e^{-n^c}$ that appears
in (1.5), which is explained by the fact that matrices with continuous
distributions are almost surely non-singular. In particular, this result does not
hold for discrete distributions.

Some related results which apply for discrete distributions are due to Tao and
Vu. Theorem 1.14 in [22] states that for every $\delta > 0$ and $1 \le k < n$, one has

\[ \mathbb{P}\Big\{ \lambda_{k+1}(H) - \lambda_k(H) \le n^{-1/2-\delta} \Big\} \le n^{-c(\delta)}. \tag{1.9} \]

This result does not assume a continuous distribution of the entries of $H$, just
appropriate (exponential) moment assumptions. In particular, the eigenvalue
gaps $\lambda_{k+1}(H) - \lambda_k(H)$ are of the order at least $n^{-1/2-\delta}$ with high
probability. This order is optimal up to $\delta$ in the exponent, but the polynomial
probability bound $n^{-c(\delta)}$ is not. Furthermore, (1.5) and (1.9) are results of
somewhat different nature: (1.5) establishes absolute delocalization of
eigenvalues with respect to a given point $z$, while (1.9) gives a relative
delocalization with respect to the neighboring eigenvalues.

Finally, recent universality results due to Tao and Vu [21, 22] allow one to
compare the distribution of $\lambda_k(H)$ to the distribution of $\lambda_k(G)$ where
$G$ is a symmetric matrix with independent $N(0, 1)$ entries. These results also
apply for matrices $H$ with discrete distributions, although one has to assume
that the first few moments (such as three or four) of the entries of $H$ and of $G$
are equal (so it does not seem that this approach can be used for symmetric
Bernoulli matrices). Also, such comparisons come at a cost of a polynomial,
rather than exponential, probability error:

\[
\begin{aligned}
\mathbb{P}\Big\{ \min_k |\lambda_k(G)| \le \varepsilon n^{-1/2} - n^{-c-1/2} \Big\} - O(n^{-c})
&\le \mathbb{P}\Big\{ \min_k |\lambda_k(H)| \le \varepsilon n^{-1/2} \Big\} \\
&\le \mathbb{P}\Big\{ \min_k |\lambda_k(G)| \le \varepsilon n^{-1/2} + n^{-c-1/2} \Big\} + O(n^{-c}).
\end{aligned}
\tag{1.10}
\]

(See Corollary 24 in [21] and its proof.)

Remark 1.3. After the results of this paper had been obtained, the author was
informed of an independent work by Nguyen [10], which improved the
Costello-Tao-Vu singularity probability bound (1.2) for symmetric Bernoulli
matrices to

\[ \mathbb{P}\{H \text{ is singular}\} = O(n^{-M}) \]

for every $M > 0$, where the constant implicit in $O(\cdot)$ depends only on $M$. The
even more recent work by Nguyen [11], which was announced a few days after
the current paper had been posted, demonstrated that for every $M > 0$ there
exists $K > 0$ such that

\[ \mathbb{P}\Big\{ \min_k |\lambda_k(H)| \le n^{-K} \Big\} \le n^{-M}. \]

While Nguyen's results give weaker conclusions than the results in this paper,
they hold under somewhat weaker conditions on the distribution than (H) (for
example, the entries of $H$ do not need to have mean zero); see [11] for precise
statements.

Remark 1.4 (Optimality). Although the magnitude of the gap $n^{-1/2}$ in
Theorem 1.1 is optimal, the form of (1.1) and (1.8) suggests that the exponent 1/9 is
not optimal. Indeed, our argument automatically yields £1/^+'' for every 5 > 
(with constants $C$, $c$ depending also on $\delta$). Some further improvement of the
exponent may be possible with a more accurate argument, but the technique of this
paper would still not reach the optimal exponent 1 (in particular, due to losses
in decoupling). Furthermore, we conjecture that the singularity probability term
$2e^{-n^c}$ in (1.5) may be improved to $2e^{-cn}$.



1.3 Four moments 

Even without the subgaussian assumption (1.4) on the entries of $H$, the bound on
the spectral norm $\|H\| = \max_k |\lambda_k(H)|$ can be removed from (1.3); however,
this will lead to a weaker probability bound than in Theorem 1.2.

Theorem 1.5 (Four moments). Let $H$ be an $n \times n$ symmetric random matrix
satisfying (H), whose off-diagonal entries have finite fourth moment $M_4^4$, and
whose diagonal entries satisfy $|h_{ii}| \le K\sqrt{n}$ for some $K$. For every $p > 0$
there exist $n_0, \varepsilon > 0$ that depend only on the fourth moment of entries, $K$ and
$p$, and such that for all $n \ge n_0$ one has

\[ \mathbb{P}\Big\{ \min_k |\lambda_k(H) - z| \le \varepsilon n^{-1/2} \Big\} \le p. \]

To see how this result follows from Theorem 1.1, note that a result of Latala
implies a required bound on the spectral norm. Indeed, Lemma 2.4 and Markov's
inequality yield $\|H\| = \max_k |\lambda_k(H)| \le (CM_4 + K)\sqrt{n}$ with high
probability. Using this together with (1.3) implies Theorem 1.5.

An immediate consequence of Theorem 1.5 is that such matrices $H$ are
asymptotically almost surely non-singular:

\[ \mathbb{P}\{H \text{ is singular}\} \le p_n(M_4, K) \to 0 \quad \text{as } n \to \infty. \]

Like Theorem 1.2, Theorem 1.5 also establishes the delocalization of eigenvalues
on the optimal scale $n^{-1/2}$ and the bounds on the resolvent (1.6), on the
norm of the inverse and on the condition number (1.7); all these hold under just
the fourth moment assumption as in Theorem 1.5.

1.4 Overview of the argument 

Decomposition into compressible and incompressible vectors. Let us
explain the heuristics of the proof of Theorem 1.1. Consider the matrix
$A = H - zI$. Note that
$\min_k |\lambda_k(H) - z| = \min_k |\lambda_k(A)| = \min_{x \in S^{n-1}} \|Ax\|_2$, where
$S^{n-1}$ denotes the Euclidean sphere in $\mathbb{R}^n$. So our task is to bound above the
probability

\[ \mathbb{P}\Big\{ \min_{x \in S^{n-1}} \|Ax\|_2 \le \varepsilon n^{-1/2} \Big\}. \]

In other words, we need to prove the lower bound $\|Ax\|_2 \gtrsim n^{-1/2}$ uniformly
for all vectors $x \in S^{n-1}$, and with high probability.

Our starting point is the method developed in [13] for a similar invertibility
problem for matrices $A$ with all independent entries, see also [15]. We decompose
the sphere $S^{n-1} = \mathrm{Comp} \cup \mathrm{Incomp}$ into the classes of compressible
and incompressible vectors. A vector $x$ is in Comp if $x$ is within distance, say,
0.1 from the set of vectors of support $0.1n$. We seek to establish invertibility
of $A$ separately for the two classes, our goal being

\[ \min_{x \in \mathrm{Comp}} \|Ax\|_2 \gtrsim n^{1/2}, \qquad \min_{x \in \mathrm{Incomp}} \|Ax\|_2 \gtrsim n^{-1/2}. \tag{1.11} \]

(The first estimate is even stronger than we need.) Each of the two classes,
compressible and incompressible, has its own advantages.

Invertibility for compressible vectors. The class Comp has small metric
entropy, which makes it amenable to covering arguments. This essentially reduces
the invertibility problem for Comp to proving the lower bound
$\|Ax\|_2 \gtrsim n^{1/2}$ with high probability for one (arbitrary) vector
$x \in \mathrm{Comp}$. If $A$ had all independent entries (as in [13]) then we could
express $\|Ax\|_2^2$ as a sum of independent random variables
$\sum_{k=1}^n \langle A_k, x \rangle^2$, where $A_k$ denote the rows of $A$, and finish by
showing that each $\langle A_k, x \rangle$ is unlikely to be $o(1)$. But in our case, $A$ is
symmetric, so the $A_k$ are not independent. Nevertheless, we can extract from $A$
a minor $G$ with all independent entries. To this end, consider a subset
$I \subset [n]$ with $|I| = \lambda n$ where $\lambda \in (0, 1)$ is a small number. We decompose

\[ A = \begin{pmatrix} D & G \\ G^* & E \end{pmatrix}, \qquad x = \begin{pmatrix} y \\ z \end{pmatrix}, \tag{1.12} \]

where $D$ is an $I^c \times I^c$ matrix, $G$ is an $I^c \times I$ matrix, $y \in \mathbb{R}^{I^c}$,
$z \in \mathbb{R}^I$. Then $\|Ax\|_2 \ge \|Dy + Gz\|_2$. Conditioning on the entries in $D$
and denoting the fixed vector $-Dy$ by $v$, we reduce the problem to showing that

\[ \|Ax\|_2 \ge \|Gz - v\|_2 \gtrsim n^{1/2} \quad \text{with high probability.} \tag{1.13} \]

Now $G$ is a matrix with all independent entries, so the previous reasoning yields
(1.13) with probability at least $1 - 2e^{-cn}$. This establishes the first part of our
goal (1.11), i.e. the good invertibility of $A$ on the class of compressible vectors.



Concentration of quadratic forms. The second part of our goal (1.11) is
more difficult. A very general observation from [13] reduces the invertibility
problem for incompressible vectors to a distance problem for a random vector
and a random hyperplane (see Section 3). Specifically, we need to show that

\[ \mathrm{dist}(A_1, H_1) \gtrsim 1 \quad \text{with high probability,} \tag{1.14} \]

where $A_1$ denotes the first column of $A$ and $H_1$ denotes the span of the other
$n - 1$ columns. An elementary observation (Proposition 5.1) is that

\[ \mathrm{dist}(A_1, H_1) = \frac{|\langle B^{-1}Z, Z \rangle - a_{11}|}{\sqrt{1 + \|B^{-1}Z\|_2^2}}, \quad \text{where } A = \begin{pmatrix} a_{11} & Z^* \\ Z & B \end{pmatrix}. \]

Obviously the random vector $Z \in \mathbb{R}^{n-1}$ and the $(n-1) \times (n-1)$ symmetric
random matrix $B$ are independent, and $B$ has the same structure as $A$ (its
above-diagonal entries are independent). So lifting the problem back into
dimension $n$, we arrive at the following problem for quadratic forms. Let $X$ be a
random vector in $\mathbb{R}^n$ with iid coordinates with mean zero and bounded fourth
moment. Show that for every fixed $u \in \mathbb{R}$,

\[ |\langle A^{-1}X, X \rangle - u| \gtrsim \|A^{-1}\|_{\mathrm{HS}} \quad \text{with high probability,} \tag{1.15} \]






where $\|\cdot\|_{\mathrm{HS}}$ denotes the Hilbert-Schmidt norm. In other words, we need to show
that the distribution of the quadratic form $\langle A^{-1}X, X \rangle$ is spread on the real line.
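The distance identity behind this reduction, namely that $\mathrm{dist}(A_1, H_1)$ equals $|\langle B^{-1}Z, Z \rangle - a_{11}|$ divided by $\sqrt{1 + \|B^{-1}Z\|_2^2}$ when $A$ has first row $(a_{11}, Z^*)$ and lower-right block $B$, can be checked on a small example in exact arithmetic. The sketch below is only a sanity check under the assumption that all matrices involved are invertible; the helpers `solve`, `dist_sq_formula` and `dist_sq_direct` are hypothetical names introduced here, and squared distances are compared to avoid square roots.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(M, rhs):
    """Solve M y = rhs exactly by Gauss-Jordan elimination (M assumed invertible)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def dist_sq_formula(A):
    """Squared distance via (<B^{-1}Z, Z> - a_11)^2 / (1 + ||B^{-1}Z||^2)."""
    a11 = A[0][0]
    Z = [row[0] for row in A[1:]]
    B = [row[1:] for row in A[1:]]
    w = solve(B, Z)  # w = B^{-1} Z
    return (dot(w, Z) - a11) ** 2 / (1 + dot(w, w))


def dist_sq_direct(A):
    """Squared distance from column 1 to the span of the other columns (least squares)."""
    n = len(A)
    a1 = [A[i][0] for i in range(n)]
    cols = [[A[i][j] for i in range(n)] for j in range(1, n)]
    gram = [[dot(u, v) for v in cols] for u in cols]
    coef = solve(gram, [dot(u, a1) for u in cols])  # normal equations
    resid = [a1[i] - sum(coef[j] * cols[j][i] for j in range(n - 1)) for i in range(n)]
    return dot(resid, resid)
```

For the symmetric matrix with rows $(2, 1, 0)$, $(1, 3, 1)$, $(0, 1, 1)$ both functions return $3/2$.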

The spread of a general random variable $S$ is measured by the Lévy
concentration function

\[ \mathcal{L}(S, \varepsilon) := \sup_{u \in \mathbb{R}} \mathbb{P}\{|S - u| \le \varepsilon\}, \quad \varepsilon \ge 0. \]

So our problem becomes to estimate the Lévy concentration function of quadratic
forms of the type $\langle A^{-1}X, X \rangle$ where $A$ is a symmetric random matrix, and $X$ is
an independent random vector with iid coordinates.

Littlewood-Offord theory. A decoupling argument allows one to replace
$\langle A^{-1}X, X \rangle$ by the bilinear form $\langle A^{-1}Y, X \rangle$ where $Y$ is an independent
copy of $X$. (This is an ideal situation; a realistic decoupling argument will
incur some losses which we won't discuss here, see Section 8.2.) Using that
$\mathbb{E}\|A^{-1}Y\|_2^2 = \|A^{-1}\|_{\mathrm{HS}}^2$, we reduce the problem to showing that for
every $u \in \mathbb{R}$ one has

\[ |\langle x_0, X \rangle - u| \gtrsim 1 \quad \text{with high probability, where } x_0 = \frac{A^{-1}Y}{\|A^{-1}Y\|_2}. \tag{1.16} \]

By conditioning on $A$ and $Y$ we can consider $x_0$ as a fixed vector. The product

\[ S := \langle x_0, X \rangle = \sum_{k=1}^n x_0(k) X(k) \]

is a sum of independent random variables. So our problem reduces to estimating
the Lévy concentration function for general sums of independent random variables
with given coefficients $x_0(k)$.
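The second-moment identity used in this reduction, $\mathbb{E}\|MY\|_2^2 = \|M\|_{\mathrm{HS}}^2$ for a fixed matrix $M$ and a vector $Y$ with independent mean zero, unit variance coordinates, can be verified exactly when $Y$ has ±1 coordinates by averaging over all sign vectors. The sketch below is an illustration with hypothetical helper names; the Hilbert-Schmidt norm is computed as the sum of squared entries, which for symmetric matrices agrees with the sum of squared eigenvalues.

```python
from fractions import Fraction
from itertools import product


def mean_squared_norm(M):
    """Exact average of ||M y||_2^2 over all sign vectors y in {-1, 1}^n."""
    n = len(M[0])
    total = 0
    for y in product((-1, 1), repeat=n):
        image = [sum(row[j] * y[j] for j in range(n)) for row in M]
        total += sum(v * v for v in image)
    return Fraction(total, 2 ** n)


def hilbert_schmidt_sq(M):
    """Squared Hilbert-Schmidt norm, computed as the sum of squared entries."""
    return sum(v * v for row in M for v in row)
```

The cross terms cancel in the average because distinct coordinates of $Y$ are uncorrelated, which is why only the squared entries of $M$ survive.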

It turns out that the concentration function depends not only on the magnitude
of the coefficients $x_0(k)$, but also on their additive structure. A vector
$x_0$ with less 'commensurate' coefficients tends to produce better estimates for
$\mathcal{L}(S, \varepsilon)$. Many researchers including Littlewood, Offord, Erdős, Moser,
Sárközy, Szemerédi and Halász produced initial findings of this type; Kahn, Komlós
and Szemerédi [7] found applications to the invertibility problem for random
matrices. Recently this phenomenon was termed the (inverse) Littlewood-Offord theory
by Tao and Vu pTO] . They initiated a systematic study of the effect the additive 
structure of the coefficient vector $x_0$ has on the concentration function; see a
general discussion in [18, 15] with a view toward random matrix theory.
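The dependence of the concentration function on additive structure can be seen on a toy example. The following sketch computes the Lévy concentration function of $S = \sum_k x(k)\xi_k$ exactly for independent symmetric ±1 variables $\xi_k$ and integer coefficients, by enumerating all sign patterns. It is an illustration only: the function name `levy_concentration` is hypothetical, the coefficient vectors are not normalized, and the enumeration is exponential in the number of coefficients.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def levy_concentration(coeffs, eps):
    """Exact sup_u P{|S - u| <= eps} for S = sum_k coeffs[k] * xi_k, xi_k = +-1."""
    counts = Counter()
    for signs in product((-1, 1), repeat=len(coeffs)):
        counts[sum(c * s for c, s in zip(coeffs, signs))] += 1
    atoms = sorted(counts)
    best = 0
    # An optimal closed window of length 2*eps can be slid right until its
    # left endpoint hits an atom, so only such windows need to be examined.
    for left in atoms:
        mass = sum(counts[a] for a in atoms if left <= a <= left + 2 * eps)
        best = max(best, mass)
    return Fraction(best, 2 ** len(coeffs))
```

With four coefficients, the highly structured vector $(1, 1, 1, 1)$ gives `levy_concentration([1, 1, 1, 1], 0) == Fraction(3, 8)`, while the less commensurate vector $(1, 2, 4, 8)$ gives `Fraction(1, 16)`, since all sixteen signed sums are distinct.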

In [13, 14], Rudelson and the author of this paper proposed to quantify the
amount of additive structure of a vector $x \in S^{n-1}$ by the least common
denominator (LCD); the version of LCD we use here (due to Rudelson) is

\[ D(x) = \inf\Big\{ \theta > 0 : \mathrm{dist}(\theta x, \mathbb{Z}^n) < \sqrt{\log_+ \theta} \Big\}. \tag{1.17} \]

The larger $D(x)$, the less structure $x$ has, and the smaller $\mathcal{L}(S, \varepsilon)$ is
expected to be. Indeed, a variant of the Littlewood-Offord theory developed in
[13, 14] states that

\[ \mathcal{L}(S, \varepsilon) \lesssim \varepsilon + \frac{1}{D(x_0)}, \quad \varepsilon \ge 0. \tag{1.18} \]

The actual, more accurate, definition of LCD and the precise statement of (1.18)
are given in Section 6.1.

Additive structure. In order to use Littlewood-Offord theory, one has to show
that $D(x_0)$ is large for the vector $x_0$ in (1.16). This is the main difficulty in
this paper, coming from the symmetry restrictions in the matrix $A$. We believe that
the action of $A^{-1}$ on an (arbitrary) vector $Y$ should make the random vector $x_0$
completely unstructured, so it is plausible that $D(x_0) \gtrsim e^{cn}$ with high
probability, where $c > 0$ is a constant. If so, the singularity probability term in (1.3) would
improve to $e^{-cn}$. Unfortunately, we can not even prove that $D(x_0) \gtrsim e^{n^c}$.

The main losses occur in the process of decoupling and conditioning, which
is performed to reduce the symmetric matrix $A$ to a matrix with all independent
entries. In order to resist such losses, we propose in this paper to work with an
alternative (but essentially equivalent) robust version of LCD which we call the
regularized LCD. It is designed to capture the most unstructured part of $x$ of a
given size. So, for a parameter $\lambda \in (0, 1)$, we consider

\[ D(x, \lambda) = \max\Big\{ D\big(x_I / \|x_I\|_2\big) : I \subseteq [n],\ |I| = [\lambda n] \Big\}, \tag{1.19} \]

where $x_I \in \mathbb{R}^I$ denotes the restriction of the vector $x$ onto the subset $I$.
The actual, more accurate, definition of regularized LCD is given in Section 6.2.

On the one hand, if $D(x, \lambda)$ is large, then $x$ has some unstructured part
$x_I$, so we can still apply the linear Littlewood-Offord theory (restricted to $I$)
to produce good bounds on the Lévy concentration function for linear forms
(Proposition 6.9), and extend this to quadratic forms by decoupling. On the
other hand, if $D(x, \lambda)$ is small, then not only $x_I$ but all restrictions of $x$
onto arbitrary $[\lambda n]$ coordinates are nicely structured, so in fact the entire $x$
is highly structured. This yields a good control of the metric entropy of the set
of vectors with small $D(x, \lambda)$. Ultimately, this approach (explained in more
detail below) leads us to the desired structure theorem, which states that for
$\lambda \gtrsim n^{-c}$, one has

\[ D(x_0, \lambda) \gtrsim n^{c/\lambda} \quad \text{with high probability.} \tag{1.20} \]

See Theorem 7.1 for the actual statement. In other words, the structure theorem
states that the regularized LCD is larger than any polynomial in $n$. As we
explained, this estimate is then used in combination with the Littlewood-Offord
theory (1.18) to deduce estimate (1.15) for quadratic forms (after optimization in $\lambda$);





see Theorem 8.1 for the actual result on concentration of quadratic forms. This
in turn yields a solution of the distance problem (1.14), see Corollary 9.1.
Ultimately, this solves the second part of the invertibility problem (1.11), i.e. for
the incompressible vectors, and completes the proof of Theorem 1.1.

The structure theorem. The proof of structure theorem (1.20) is the main
technical ingredient of the paper. We shall explain the heuristics of this argument
in some more detail here. Let us condition on the independent vector $Y$ in (1.16).

By definition of $x_0$, the vector $Ax_0$ is collinear with the fixed vector $Y$, so
(apart from the normalization issue, which we ignore now) we can assume that
$Ax_0$ equals some fixed vector $u \in \mathbb{R}^n$. Then structure theorem (1.20) will
follow if we can show that, with high probability, all vectors $x \in S^{n-1}$ with
$D(x, \lambda) \ll n^{c/\lambda}$ satisfy $Ax \ne u$.

To this end, fix some value $D \ll n^{c/\lambda}$ and consider the level set

\[ S_D = \{ x \in S^{n-1} : D(x, \lambda) \sim D \}. \]

Our goal is to show that, with high probability, $Ax \ne u$ for all $x \in S_D$. This
will be done by a covering argument.

First we show an individual estimate, that for an arbitrary given $x \in S_D$,
$Ax \ne u$ with high probability. So let us fix $x \in S_D$ and assume that $Ax = u$.
We choose the most unstructured subset of indices $I$ of $x$, i.e. let $I$ be the
maximizing set in definition (1.19) of the regularized LCD. The decomposition
$[n] = I^c \cup I$ induces the decomposition of the matrix $A$ we considered earlier
in (1.12). Conditioning on the minor $D$, we estimate

\[ 0 = \|Ax - u\|_2^2 \ge \|Gz - v\|_2^2 = \sum_{k \in I^c} \big( \langle G_k, x_I \rangle - v_k \big)^2, \]

where $v = (v_1, \ldots, v_n)$ denotes some fixed vector (which depends on $u$ and the
entries of $D$, which are now fixed), and $G_k$ denote the rows of the minor $G$. It
follows that $\langle G_k, x_I \rangle - v_k = 0$ for all $k \in I^c$. Since $G$ has independent
entries, the probability of these equalities can be estimated using the
Littlewood-Offord estimate (1.18) as

\[ \mathbb{P}\big\{ \langle G_k, x_I \rangle - v_k = 0 \big\} \lesssim \frac{1}{D(x_I)} \sim \frac{1}{D(x, \lambda)} \sim \frac{1}{D}, \quad k \in I^c. \]

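Therefore, by independence we have

\[ \mathbb{P}\{Ax = u\} \lesssim \Big(\frac{1}{D}\Big)^{|I^c|} = \Big(\frac{1}{D}\Big)^{n - \lambda n} \quad \text{for all } x \in S_D. \tag{1.21} \]

On the other hand, the level set $S_D$ has small metric entropy. To see this,
first consider the level set of the usual LCD in (1.17):

\[ T_D = \{ x \in S^{n-1} : D(x) \sim D \}. \]

Since the number of integer points in a Euclidean ball of radius $D$ in $\mathbb{R}^n$ is
about $(D/\sqrt{n})^n$, the definition of LCD implies that there exists a $\beta$-net
$\mathcal{M}$ of $T_D$ in the Euclidean metric with

\[ \beta \sim \frac{\sqrt{\log D}}{D}, \qquad |\mathcal{M}| \lesssim \Big(\frac{D}{\sqrt{n}}\Big)^{n}. \]

Now consider an arbitrary $x \in S_D$. By definition of the regularized LCD, the
restriction $x_I$ onto any set $I$ of $\lambda n$ coordinates has
$D(x_I / \|x_I\|_2) \lesssim D$. So we can decompose $[n]$ into $1/\lambda$ sets of indices
$I_j$, $|I_j| = \lambda n$, and for the restriction of $x$ onto each $I_j$ construct a
$\beta$-net $\mathcal{M}_j$ in $\mathbb{R}^{I_j}$ with
$|\mathcal{M}_j| \lesssim (D/\sqrt{\lambda n})^{\lambda n}$ as above.

The product of these nets $\mathcal{M}_j$ obviously forms a $\beta/\sqrt{\lambda}$-net
$\mathcal{N}$ of $S_D$ with

\[ |\mathcal{N}| \lesssim \Big( \Big(\frac{D}{\sqrt{\lambda n}}\Big)^{\lambda n} \Big)^{1/\lambda} = \Big(\frac{D}{\sqrt{\lambda n}}\Big)^{n}. \]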

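Finally, we take a union bound of the probability estimates (1.21) over all $x$ in
the net $\mathcal{N}$ of $S_D$. This gives

\[ \mathbb{P}\{ \exists x \in \mathcal{N} : Ax = u \} \lesssim \Big(\frac{1}{D}\Big)^{n - \lambda n} \Big(\frac{D}{\sqrt{\lambda n}}\Big)^{n} = \Big(\frac{D^{\lambda}}{\sqrt{\lambda n}}\Big)^{n}. \]

Therefore, if $D \le (\lambda n)^{c/\lambda}$ then the probability bound is exponentially small.
An approximation argument (using the bound $\|A\| = O(\sqrt{n})$) extends this from
the net $\mathcal{N}$ to the entire sub-level set $S_D$, and a simple union bound over all
$D \le (\lambda n)^{c/\lambda}$ finally yields

\[ \mathbb{P}\big\{ \exists x \in S^{n-1},\ D(x, \lambda) \le (\lambda n)^{c/\lambda} : Ax = u \big\} \le e^{-n}. \]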



As we said, this implies that with (exponentially) large probability,
$D(x_0, \lambda) \gtrsim (\lambda n)^{c/\lambda}$, which is essentially the statement of structure
theorem (1.20).
2 Notation and initial reductions of the problem

2.1 Notation

Throughout this paper $C, C_1, C_2, c, c_1, c_2, \ldots$ will denote positive constants.
When it does not create confusion, the same letter (say, $C$) may denote different
constants in different parts of the proof. The value of the constants may depend on
some natural parameters such as the fourth moment of the entries of $H$, but it
will never depend on the dimension $n$. Whenever possible, we will state which
parameters the constant depends on.

The discrete interval is denoted $[n] = \{1, \ldots, n\}$. The logarithms $\log a$ are
natural unless noted otherwise.

$\mathbb{P}\{\mathcal{E}\} = \mathbb{P}_{X,Y}\{\mathcal{E}\}$ stands for the probability of an event
$\mathcal{E}$ that depends on the values of random variables, say, $X$ and $Y$. Similarly,
$\mathbb{E} f(X, Y) = \mathbb{E}_{X,Y} f(X, Y)$ stands for the expected value of a certain
function $f(X, Y)$ of random variables $X$ and $Y$.

For a vector $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$, the Euclidean norm is
$\|x\|_2 = \big( \sum_{k=1}^n |x_k|^2 \big)^{1/2}$ and the sup-norm is
$\|x\|_\infty = \max_k |x_k|$. The unit Euclidean sphere is
$S^{n-1} = \{x : \|x\|_2 = 1\}$ and the unit Euclidean ball is
$B_2^n = \{x : \|x\|_2 \le 1\}$.

The Euclidean distance from a point $x \in \mathbb{R}^n$ to a subset $T \subset \mathbb{R}^n$ is
denoted $\mathrm{dist}(x, T) = \inf\{\|x - t\|_2 : t \in T\}$.

Consider a subset $I \subseteq [n]$. The unit Euclidean ball in $\mathbb{R}^I$ is denoted
$B_2^I$. The orthogonal projection in $\mathbb{R}^n$ onto $\mathbb{R}^I$ is denoted
$P_I : \mathbb{R}^n \to \mathbb{R}^n$. The restriction of a vector
$x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ onto the coordinates in $I$ is denoted $x_I$. Thus
$P_I x$ is a vector in $\mathbb{R}^n$ (with zero coordinates outside $I$), while
$x_I = (x_k)_{k \in I}$ is a vector in $\mathbb{R}^I$.

Let $A$ be an $n \times n$ symmetric matrix. The eigenvalues of $A$ arranged in a
non-decreasing order are denoted $\lambda_k(A)$. The spectral norm of $A$ is

\[ \|A\| = \max_k |\lambda_k(A)| = \max_{x \in S^{n-1}} \|Ax\|_2. \tag{2.1} \]

The eigenvalue of the smallest magnitude determines the norm of the inverse:

\[ \min_k |\lambda_k(A)| = \min_{x \in S^{n-1}} \|Ax\|_2 = 1/\|A^{-1}\|. \tag{2.2} \]

The transpose of $A$ is denoted $A^*$. The Hilbert-Schmidt norm of $A$ is denoted

\[ \|A\|_{\mathrm{HS}} = \Big( \sum_{k=1}^n \lambda_k(A)^2 \Big)^{1/2}. \]

k=l 

2.2 Nets and bounds on the spectral norm 

Consider a compact set $T \subset \mathbb{R}^n$ and $\varepsilon > 0$. A subset
$\mathcal{N} \subseteq T$ is called an $\varepsilon$-net of $T$ if for every point $t \in T$ one has
$\mathrm{dist}(t, \mathcal{N}) \le \varepsilon$. The minimal cardinality of an $\varepsilon$-net of $T$
is called the covering number of $T$ (for a given $\varepsilon$), and is denoted
$N(T, \varepsilon)$. Equivalently, $N(T, \varepsilon)$ is the minimal number of closed Euclidean
balls of radii $\varepsilon$ and centered in points of $T$, whose union covers $T$.

Remark 2.1 (Centering). Suppose $T$ can be covered with $N$ balls of radii $\varepsilon$, but
their centers are not necessarily in $T$. Then enlarging the radii by the factor of
2, we can place the centers in $T$. So $N(T, 2\varepsilon) \le N$.

Lemma 2.2 (See e.g. [23], Lemma 2). For every subset $T \subseteq S^{n-1}$ and every
$\varepsilon \in (0, 1]$, one has

\[ N(T, \varepsilon) \le (3/\varepsilon)^n. \]
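To make the notion of an $\varepsilon$-net concrete, the following sketch builds a net of a finite subset of the unit circle $S^1$ greedily: a point is added to the net whenever it is farther than $\varepsilon$ from all points chosen so far. The resulting set is $\varepsilon$-separated and, by construction, an $\varepsilon$-net centered in the set. This is an illustration only; the greedy construction and the names `greedy_net` and `circle_points` are assumptions made here, not the argument used to prove the lemma.

```python
import math


def greedy_net(points, eps):
    """Greedy eps-net: keep a point only if it is farther than eps from the net so far."""
    net = []
    for p in points:
        if all(math.dist(p, q) > eps for q in net):
            net.append(p)
    return net


def circle_points(m):
    """m equally spaced points on the unit circle S^1 in R^2."""
    return [(math.cos(2 * math.pi * k / m), math.sin(2 * math.pi * k / m))
            for k in range(m)]


T = circle_points(1000)
net = greedy_net(T, 0.5)
# Every point of T is within 0.5 of the net, and the cardinality is far
# below the bound (3 / eps)^n = 36 of the lemma with n = 2.
covered = all(min(math.dist(t, q) for q in net) <= 0.5 for t in T)
```

Every point that was skipped lies within $\varepsilon$ of an earlier net point, which is exactly the net property; the separation of the net points is what limits the cardinality.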
The following known lemma was used to deduce Theorem 1.2 for subgaussian
matrices from our general result, Theorem 1.1.

Lemma 2.3 (Spectral norm: subgaussian). Let $H$ be a symmetric random matrix
as in Theorem 1.2. Then

\[ \mathbb{P}\big\{ \|H\| \le (C_{2.3} M + K)\sqrt{n} \big\} \ge 1 - 2e^{-n}, \]

where $C_{2.3}$ is an absolute constant.

Proof. Let us decompose the matrix as $H = D + B + B^*$ where $D$ is the diagonal
part of $H$, and $B$ is the above-diagonal part of $H$. Since $\|D\| \le K\sqrt{n}$ by
assumption and $\|B\| = \|B^*\|$, we have $\|H\| \le K\sqrt{n} + 2\|B\|$. Furthermore,
since the entries of $B$ on and below the diagonal are zero, all $n^2$ entries of $B$
are independent mean zero random variables with subgaussian moments bounded
by $M$. Proposition 2.4 of [15] then implies a required bound on $\|B\|$:

\[ \mathbb{P}\big\{ \|B\| \le CM\sqrt{n} \big\} \ge 1 - 2e^{-n}, \]

where $C$ is an absolute constant. This completes the proof. □

A similar spectral bound holds just under the fourth moment assumption,
although only in expectation.

Lemma 2.4 (Spectral norm: four moments). Let $H$ be a symmetric random
matrix as in Theorem 1.5. Then

\[ \mathbb{E}\|H\| \le (C_{2.4} M_4 + K)\sqrt{n}. \]

Here $C_{2.4}$ is an absolute constant.

Proof. We use the same decomposition $H = D + B + B^*$ as in the proof of
Lemma 2.3. A result of Latala [8] implies that $\mathbb{E}\|B\| \le CM_4\sqrt{n}$ where $C$
is an absolute constant. Thus

\[ \mathbb{E}\|H\| \le \|D\| + 2\,\mathbb{E}\|B\| \le (K + 2CM_4)\sqrt{n}. \]

The lemma is proved. □

2.3 Initial reductions of the problem 

We are going to prove Theorem 1.1. Without loss of generality, we can assume
that $K \ge 1$ by increasing this value. Also we can assume that the constant $c$ in
this theorem is sufficiently small, depending on the value of the fourth moment
and on $K$. Consequently, we can assume that $n \ge n_0$ where $n_0$ is a sufficiently
large number that depends on the fourth moment and on $K$. (For $n < n_0$ the
probability bound in (1.3) will be larger than 1, which is trivially true.) By a
similar reasoning, we can assume that $\varepsilon \in (0, \varepsilon_0)$ for a sufficiently small
number $\varepsilon_0 > 0$ which depends on the fourth moment and on $K$.

So we can assume that $K\sqrt{n} \ge \varepsilon n^{-1/2}$. Therefore, for $|z| > 2K\sqrt{n}$ the
probability in question is automatically zero. So we can assume that
$|z| \le 2K\sqrt{n}$.

We shall work with the random matrix

\[ A = H - zI. \]

If $\|H\| = \max_k |\lambda_k(H)| \le K\sqrt{n}$ as in (1.3) then
$\|A\| \le \|H\| + |z| \le 3K\sqrt{n}$. Therefore, the probability of the desired
event in (1.3) is bounded above by

\[ p := \mathbb{P}\Big\{ \min_k |\lambda_k(A)| \le \varepsilon n^{-1/2} \ \wedge\ \mathcal{E}_K \Big\}, \]

where $\mathcal{E}_K$ denotes the event

\[ \mathcal{E}_K = \big\{ \|A\| \le 3K\sqrt{n} \big\}. \tag{2.3} \]

Using (2.2), we see that Theorem 1.1 would follow if we prove that

\[ p := \mathbb{P}\Big\{ \min_{x \in S^{n-1}} \|Ax\|_2 \le \varepsilon n^{-1/2} \ \wedge\ \mathcal{E}_K \Big\} \le C\varepsilon^{1/9} + 2e^{-n^c}. \tag{2.4} \]

We do this under the following assumptions on the random matrix A: 

(A) $A = (a_{ij})$ is an $n \times n$ real symmetric matrix. The above-diagonal entries
$a_{ij}$, $i < j$, are independent and identically distributed random variables
with

\[ \mathbb{E} a_{ij} = 0, \quad \mathbb{E} a_{ij}^2 = 1, \quad \mathbb{E} a_{ij}^4 \le M_4^4 \quad \text{for } j > i, \tag{2.5} \]

where $M_4$ is some finite number. The diagonal entries $a_{ii}$ are arbitrary
fixed numbers.

The constants $C$ and $c > 0$ in (2.4) will have to depend only on $K$ and $M_4$.

By a small perturbation of the entries of A (e.g. adding independent normal 
random variables with zero means and small variances), we can assume that 
the distribution of the entries $a_{ij}$ is absolutely continuous. In particular, the
columns of A are in a general position almost surely. So the matrix A as well as 
all of its square minors are invertible almost surely; this allows us to ignore some 
technicalities that can arise in degenerate cases. 

3 Preliminaries: small ball probabilities, compressible and incompressible vectors

In this section we recall some preliminary material from [13, 14].

3.1 Small ball probabilities, Lévy concentration function

Definition 3.1 (Small ball probabilities). Let $Z$ be a random vector in $\mathbb{R}^n$. The
Lévy concentration function of $Z$ is defined as

\[ \mathcal{L}(Z, \varepsilon) = \sup_{u \in \mathbb{R}^n} \mathbb{P}\{ \|Z - u\|_2 \le \varepsilon \}. \]

The Lévy concentration function bounds the small ball probabilities for $Z$,
which are the probabilities that $Z$ falls in a Euclidean ball of radius $\varepsilon$.

A simple but rather weak bound on the Lévy concentration function follows from
the Paley-Zygmund inequality.

Lemma 3.2 ([14], Lemma 3.2). Let $Z$ be a random variable with unit variance
and with finite fourth moment, and put $M_4^4 := \mathbb{E}(Z - \mathbb{E}Z)^4$. Then for every
$\varepsilon \in (0, 1)$ there exists $p = p(M_4, \varepsilon) \in (0, 1)$ such that

\[ \mathcal{L}(Z, \varepsilon) \le p. \]

There has been significant interest in bounding the Lévy concentration function
for sums of independent random variables; see [13, 14, 15] for discussion. The
following simple but weak bound was essentially proved in [13], Lemma 2.6 (up
to centering).

Lemma 3.3 (Lévy concentration function for sums). Let $\xi_1, \ldots, \xi_n$ be
independent random variables with unit variances and
$\mathbb{E}(\xi_k - \mathbb{E}\xi_k)^4 \le M_4^4$, where $M_4$ is some finite number. Then for every
$\varepsilon \in (0, 1)$ there exists $p = p(M_4, \varepsilon) \in (0, 1)$ such that the following holds.

For every vector $x = (x_1, \ldots, x_n) \in S^{n-1}$, the sum $S = \sum_{k=1}^n x_k \xi_k$
satisfies

\[ \mathcal{L}(S, \varepsilon) \le p. \]

Proof. Clearly $S$ has unit variance. Furthermore, since
$S - \mathbb{E}S = \sum_{k=1}^n x_k (\xi_k - \mathbb{E}\xi_k)$, an application of Khinchine
inequality yields

\[ \mathbb{E}(S - \mathbb{E}S)^4 \le CM_4^4, \]

where $C$ is an absolute constant (see [13], proof of Lemma 2.6). The desired
concentration bound then follows from Lemma 3.2 with $Z = S - \mathbb{E}S$. □

The following tensorization lemma can be used to transfer bounds for the Levy 
concentration function from random variables to random vectors. This result 
follows from [13], Lemma 2.2 with = \xk — Uk\, where u = (ui, . . . , Un) E M"'. 

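Lemma 3.4 (Tensorization). Let $X = (X_1, \ldots, X_n)$ be a random vector in $\mathbb{R}^n$ with independent coordinates $X_k$.

1. Suppose there exist numbers $\varepsilon_0 \ge 0$ and $L \ge 0$ such that

$$\mathcal{L}(X_k, \varepsilon) \le L\varepsilon \quad \text{for all } \varepsilon \ge \varepsilon_0 \text{ and all } k.$$

Then

$$\mathcal{L}(X, \varepsilon\sqrt{n}) \le (C_{3.4} L \varepsilon)^n \quad \text{for all } \varepsilon \ge \varepsilon_0,$$

where $C_{3.4}$ is an absolute constant.

2. Suppose there exist numbers $\varepsilon > 0$ and $p \in (0, 1)$ such that

$$\mathcal{L}(X_k, \varepsilon) \le p \quad \text{for all } k.$$

Then there exist numbers $\varepsilon_1 = \varepsilon_1(\varepsilon, p) > 0$ and $p_1 = p_1(\varepsilon, p) \in (0, 1)$ such that

$$\mathcal{L}(X, \varepsilon_1\sqrt{n}) \le p_1^n.$$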

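Remark 3.5. A useful equivalent form of Lemma 3.4 (part 1) is the following one. Suppose there exist numbers $a, b \ge 0$ such that

$$\mathcal{L}(X_k, \varepsilon) \le a\varepsilon + b \quad \text{for all } \varepsilon \ge 0 \text{ and all } k.$$

Then

$$\mathcal{L}(X, \varepsilon\sqrt{n}) \le [C_{3.5}(a\varepsilon + b)]^n \quad \text{for all } \varepsilon \ge 0,$$

where $C_{3.5}$ is an absolute constant.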

3.2 Compressible and incompressible vectors 

Let $c_0, c_1 \in (0, 1)$ be two numbers. We will choose their values later as small constants that depend only on the parameters $K$ and $M_4$ from (2.4) and (A), see Remark 4.3 below.

Definition 3.6 ([13], Definition 2.4). A vector $x \in \mathbb{R}^n$ is called sparse if $|\mathrm{supp}(x)| \le c_0 n$. A vector $x \in S^{n-1}$ is called compressible if $x$ is within Euclidean distance $c_1$ from the set of all sparse vectors. A vector $x \in S^{n-1}$ is called incompressible if it is not compressible.

The sets of compressible and incompressible vectors in $S^{n-1}$ will be denoted by $\mathrm{Comp}(c_0, c_1)$ and $\mathrm{Incomp}(c_0, c_1)$ respectively.
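As a small illustration of Definition 3.6, the following Python sketch tests compressibility of a unit vector. It relies on the elementary fact that the closest $s$-sparse vector to $x$ keeps the $s$ largest coordinates of $x$ in absolute value; the function names and the numerical values of $c_0, c_1$ are illustrative assumptions, not choices made in the paper.

```python
import math

def distance_to_sparse(x, s):
    """Euclidean distance from x to the set of vectors with at most s nonzero coordinates."""
    # The closest s-sparse vector keeps the s largest-magnitude coordinates of x,
    # so the distance is the l2 norm of the remaining (smaller) coordinates.
    tail = sorted((abs(t) for t in x), reverse=True)[s:]
    return math.sqrt(sum(t * t for t in tail))

def is_compressible(x, c0, c1):
    """True if the unit vector x lies within distance c1 of the (c0*n)-sparse vectors."""
    n = len(x)
    s = math.floor(c0 * n)  # sparse means |supp(x)| <= c0 * n
    return distance_to_sparse(x, s) <= c1

n = 100
flat = [1 / math.sqrt(n)] * n    # mass spread evenly over all coordinates
spike = [1.0] + [0.0] * (n - 1)  # a coordinate vector, itself sparse

print(is_compressible(flat, 0.1, 0.1))
print(is_compressible(spike, 0.1, 0.1))
```

With these illustrative parameters the evenly spread vector is incompressible, while the coordinate vector is compressible.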

The classes of compressible and incompressible vectors each have their own advantages. The set of compressible vectors has small covering numbers, which are exponential in $c_0 n$ rather than in $n$:

Lemma 3.7 (Covering compressible vectors). One has

$$N(\mathrm{Comp}(c_0, c_1), 2c_1) \le (9/c_0 c_1)^{c_0 n}.$$

Proof. Let $s = \lfloor c_0 n \rfloor$. By Lemma 2.2, the unit sphere $S^{s-1}$ of $\mathbb{R}^s$ can be covered with at most $(3/c_1)^s$ Euclidean balls of radii $c_1$. Therefore, the set $S$ of sparse vectors in $\mathbb{R}^n$ can be covered with at most $\binom{n}{s}(3/c_1)^s$ Euclidean balls of radii $c_1$ centered in $S$. Enlarging the radii of these balls we conclude that $\mathrm{Comp}(c_0, c_1)$ can be covered with at most $\binom{n}{s}(3/c_1)^s$ Euclidean balls of radii $2c_1$ centered in $S$. The conclusion of the lemma follows by estimating $\binom{n}{s} \le (en/s)^s$, which is a consequence of Stirling's approximation. □
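The last step of this proof can be sanity-checked numerically. The sketch below compares, on a logarithmic scale, the count $\binom{n}{s}(3/c_1)^s$ from the proof with the stated bound $(9/c_0 c_1)^{c_0 n}$; the parameter values are illustrative assumptions only.

```python
import math

def log_cover_count(n, c0, c1):
    """log of binom(n, s) * (3/c1)^s with s = floor(c0*n), the count used in the proof."""
    s = math.floor(c0 * n)
    return math.log(math.comb(n, s)) + s * math.log(3 / c1)

def log_lemma_bound(n, c0, c1):
    """log of the bound (9/(c0*c1))^(c0*n) stated in Lemma 3.7."""
    return c0 * n * math.log(9 / (c0 * c1))

# Working with logarithms avoids floating-point overflow for large n.
for n in (10, 100, 1000):
    print(n, log_cover_count(n, 0.1, 0.2) <= log_lemma_bound(n, 0.1, 0.2))
```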

The set of incompressible vectors has a different advantage. Each incompressible vector $x$ has a set of coordinates of size proportional to $n$, whose magnitudes are all of the same order $n^{-1/2}$. We can say that an incompressible vector is spread over this set:

Lemma 3.8 (Incompressible vectors are spread, [13], Lemma 3.4). For every $x \in \mathrm{Incomp}(c_0, c_1)$, one has

$$\frac{c_1}{\sqrt{2n}} \le |x_k| \le \frac{1}{\sqrt{c_0 n}}$$

for at least $\frac{1}{2} c_0 c_1^2 n$ coordinates $x_k$ of $x$.
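The spread coordinates of Lemma 3.8 are easy to list for a concrete vector. The following sketch does this for an evenly spread unit vector; the function name and the values of $c_0, c_1$ are hypothetical choices made for illustration.

```python
import math

def spread_coordinates(x, c0, c1):
    """Indices k with c1/sqrt(2n) <= |x_k| <= 1/sqrt(c0*n), as in Lemma 3.8."""
    n = len(x)
    lo, hi = c1 / math.sqrt(2 * n), 1 / math.sqrt(c0 * n)
    return [k for k, t in enumerate(x) if lo <= abs(t) <= hi]

n = 100
flat = [1 / math.sqrt(n)] * n  # an incompressible vector for c0 = c1 = 0.1
count = len(spread_coordinates(flat, 0.1, 0.1))
# Compare the number of spread coordinates with the lower bound (1/2) * c0 * c1^2 * n.
print(count, ">=", 0.5 * 0.1 * 0.1 ** 2 * n)
```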

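Since $S^{n-1}$ can be decomposed into two disjoint sets $\mathrm{Comp}(c_0, c_1)$ and $\mathrm{Incomp}(c_0, c_1)$, the problem of proving (2.4) reduces to establishing the good invertibility of the matrix $A$ on these two classes separately:

$$\mathbb{P}\Big\{ \min_{x \in S^{n-1}} \|Ax\|_2 \le \varepsilon n^{-1/2} \wedge \mathcal{E}_K \Big\} \le \mathbb{P}\Big\{ \inf_{x \in \mathrm{Comp}(c_0, c_1)} \|Ax\|_2 \le \varepsilon n^{-1/2} \wedge \mathcal{E}_K \Big\} + \mathbb{P}\Big\{ \inf_{x \in \mathrm{Incomp}(c_0, c_1)} \|Ax\|_2 \le \varepsilon n^{-1/2} \wedge \mathcal{E}_K \Big\}. \tag{3.1}$$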



3.3 Invertibility for incompressible vectors via the distance problem

The first part of the invertibility problem (3.1), for compressible vectors, will be settled in Section 4. The second part, for incompressible vectors, quickly reduces to a distance problem for a random vector and a random hyperplane:

Lemma 3.9 (Invertibility via distance, [13], Lemma 3.5). Let $A$ be any $n \times n$ random matrix. Let $A_1, \ldots, A_n$ denote the columns of $A$, and let $H_k$ denote the span of all columns except the $k$-th. Then for every $c_0, c_1 \in (0, 1)$ and every $\varepsilon > 0$, one has

$$\mathbb{P}\Big\{ \inf_{x \in \mathrm{Incomp}(c_0, c_1)} \|Ax\|_2 \le \varepsilon n^{-1/2} \Big\} \le \frac{1}{c_0 n} \sum_{k=1}^n \mathbb{P}\{ \mathrm{dist}(A_k, H_k) \le c_1^{-1}\varepsilon \}. \tag{3.2}$$

This reduces our task to finding a lower bound for $\mathrm{dist}(A_k, H_k)$. This distance problem will be studied in the second half of the paper, beginning with Section 5.

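Remark 3.10. Since the distribution of a random matrix $A$ is completely general in Lemma 3.9, by conditioning on $\mathcal{E}_K$ we can replace the conclusion (3.2) by

$$\mathbb{P}\Big\{ \inf_{x \in \mathrm{Incomp}(c_0, c_1)} \|Ax\|_2 \le \varepsilon n^{-1/2} \wedge \mathcal{E}_K \Big\} \le \frac{1}{c_0 n} \sum_{k=1}^n \mathbb{P}\{ \mathrm{dist}(A_k, H_k) \le c_1^{-1}\varepsilon \wedge \mathcal{E}_K \}.$$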



4 Invertibility for compressible vectors 

In this section we establish a uniform lower bound for $\|Ax\|_2$ on the set of compressible vectors $x$. This solves the first part of the invertibility problem in (3.1).

4.1 Small ball probabilities for Ax

We shall first find a lower bound for $\|Ax\|_2$ for a fixed vector $x$. We start with a very general estimate. It will be improved later to a finer result, Proposition 6.11, which will take into account the additive structure of $x$.

Proposition 4.1 (Small ball probabilities for Ax). Let $A$ be a random matrix which satisfies (A). Then for every $x \in S^{n-1}$, one has

$$\mathcal{L}(Ax, c_{4.1}\sqrt{n}) \le 2e^{-c_{4.1} n}.$$

Here $c_{4.1} > 0$ depends only on the parameter $M_4$ from assumptions (2.5).

Proof. Our goal is to prove that, for an arbitrary fixed vector $u \in \mathbb{R}^n$, one has

$$\mathbb{P}\{\|Ax - u\|_2^2 \le c_{4.1}^2 n\} \le 2e^{-c_{4.1} n}.$$

Let us decompose the set of indices $[n]$ into two sets of roughly equal sizes, $\{1, \ldots, n_0\}$ and $\{n_0 + 1, \ldots, n\}$ where $n_0 = \lceil n/2 \rceil$. This induces the decomposition of the matrix $A$ and both vectors in question, which we denote

$$A = \begin{pmatrix} D & G \\ G^* & E \end{pmatrix}, \qquad x = \begin{pmatrix} y \\ z \end{pmatrix}, \qquad u = \begin{pmatrix} v \\ w \end{pmatrix}.$$

This way, we express

$$\|Ax - u\|_2^2 = \|Dy + Gz - v\|_2^2 + \|G^*y + Ez - w\|_2^2. \tag{4.1}$$

We shall estimate the two terms separately, using that each of the matrices $G$ and $G^*$ has independent entries.

We condition on an arbitrary realization of $D$ and $E$, and we express

$$\|Dy + Gz - v\|_2^2 = \sum_{j=1}^{n_0} \big(\langle G_j, z\rangle - d_j\big)^2$$

where $G_j$ denote the rows of $G$ and $d_j$ denote the coordinates of the fixed vector $v - Dy$. For each $j$, we observe that $\langle G_j, z\rangle = \sum_{i=n_0+1}^n a_{ij} x_i$ is a sum of independent random variables, and $\sum_{i=n_0+1}^n x_i^2 = \|z\|_2^2$. Therefore Lemma 3.3 can be applied to control the small ball probabilities as

$$\mathcal{L}\Big(\Big\langle G_j, \frac{z}{\|z\|_2}\Big\rangle, \frac{1}{2}\Big) \le c_3 \in (0, 1)$$

where $c_3$ depends only on the parameter $M_4$ from assumptions (2.5).

Further, we apply Tensorization Lemma 3.4 (part 2) for the vector $Gz/\|z\|_2$ with coordinates $\langle G_j, z/\|z\|_2\rangle$, $j = 1, \ldots, n_0$. It follows that there exist numbers $c_2 > 0$ and $c_3 \in (0, 1)$ that depend only on $M_4$ and such that

$$\mathcal{L}(Gz, c_2\|z\|_2\sqrt{n_0}) = \mathcal{L}(Gz/\|z\|_2, c_2\sqrt{n_0}) \le c_3^{n_0}.$$

Since $Dy - v$ is a fixed vector, this implies

$$\mathbb{P}\{\|Dy + Gz - v\|_2^2 \le c_2^2\|z\|_2^2 n_0\} \le c_3^{n_0}. \tag{4.2}$$

Since this holds conditionally on an arbitrary realization of $D$, $E$, it also holds unconditionally.

By a similar argument we obtain that

$$\mathbb{P}\{\|G^*y + Ez - w\|_2^2 \le c_2^2\|y\|_2^2 (n - n_0)\} \le c_3^{n - n_0}. \tag{4.3}$$

Since $n_0 \ge n/2$ and $n - n_0 \ge n/3$ and $\|y\|_2^2 + \|z\|_2^2 = \|x\|_2^2 = 1$, we have $c_2^2\|z\|_2^2 n_0 + c_2^2\|y\|_2^2 (n - n_0) \ge \frac{1}{3} c_2^2 n$. Therefore, by (4.1), the inequality $\|Ax - u\|_2^2 \le \frac{1}{3} c_2^2 n$ implies that either the event in (4.2) holds, or the event in (4.3) holds, or both. By the union bound, we conclude that

$$\mathbb{P}\{\|Ax - u\|_2^2 \le \tfrac{1}{3} c_2^2 n\} \le c_3^{n_0} + c_3^{n - n_0} \le 2c_3^{n/3}.$$

This completes the proof. □
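The block identity (4.1) that drives this proof can be checked numerically on a small example. The sketch below builds a symmetric matrix with $\pm 1$ entries (an illustrative choice of distribution), splits it into the blocks $D$, $G$, $G^*$, $E$ with $n_0 = \lceil n/2 \rceil$, and compares both sides of (4.1); all helper names are hypothetical.

```python
import random

def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def sq_norm(v):
    """Squared Euclidean norm."""
    return sum(t * t for t in v)

random.seed(0)
n, n0 = 5, 3  # n0 = ceil(n/2)
# Symmetric matrix with independent +-1 entries on and above the diagonal.
A = [[0] * n for _ in range(n)]
for i in range(n):
    for j in range(i, n):
        A[i][j] = A[j][i] = random.choice((-1, 1))
x = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]

# Blocks: D top-left, G top-right, G* bottom-left (transpose of G), E bottom-right.
D = [row[:n0] for row in A[:n0]]
G = [row[n0:] for row in A[:n0]]
Gt = [row[:n0] for row in A[n0:]]
E = [row[n0:] for row in A[n0:]]
y, z, v, w = x[:n0], x[n0:], u[:n0], u[n0:]

lhs = sq_norm([a - b for a, b in zip(matvec(A, x), u)])
top = [a + b - c for a, b, c in zip(matvec(D, y), matvec(G, z), v)]
bottom = [a + b - c for a, b, c in zip(matvec(Gt, y), matvec(E, z), w)]
rhs = sq_norm(top) + sq_norm(bottom)
print(abs(lhs - rhs) < 1e-9)
```

The identity itself is purely algebraic; the probabilistic content of the proof lies in the independence of the entries of $G$ once $D$ and $E$ are fixed.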

4.2 Small ball probabilities for Ax uniformly over compressible x

An approximation argument allows us to extend Proposition 4.1 to a uniform invertibility bound on the set of compressible vectors $x$. The following result gives a satisfactory answer for the first part of the invertibility problem in (3.1), i.e. for the set of compressible vectors. We shall state a somewhat stronger result than is needed at this moment; the stronger form will be useful later in the proof of Lemma 7.2.

Proposition 4.2 (Small ball probabilities for compressible vectors). Let $A$ be an $n \times n$ random matrix which satisfies (A), and let $K \ge 1$. There exist $c_0, c_1, c_{4.2} \in (0, 1)$ that depend only on $K$ and $M_4$ from assumptions (2.3), (2.5), and such that the following holds. For every $u \in \mathbb{R}^n$, one has

$$\mathbb{P}\Big\{ \inf_{x/\|x\|_2 \in \mathrm{Comp}(c_0, c_1)} \|Ax - u\|_2/\|x\|_2 \le c_{4.2}\sqrt{n} \wedge \mathcal{E}_K \Big\} \le 2e^{-c_{4.2} n}. \tag{4.4}$$

Proof. Let us fix some small values of $c_0$, $c_1$ and $c_{4.2}$; the precise choice will be made shortly. According to Lemma 3.7, there exists a $(2c_1)$-net $\mathcal{N}$ of the set $\mathrm{Comp}(c_0, c_1)$ such that

$$|\mathcal{N}| \le (9/c_0 c_1)^{c_0 n}. \tag{4.5}$$

Let $\mathcal{E}$ denote the event in the left hand side of (4.4) whose probability we would like to bound. Assume that $\mathcal{E}$ holds. Then there exist vectors $x_0 := x/\|x\|_2 \in \mathrm{Comp}(c_0, c_1)$ and $u_0 := u/\|x\|_2 \in \mathrm{span}(u)$ such that

$$\|Ax_0 - u_0\|_2 \le c_{4.2}\sqrt{n}. \tag{4.6}$$

By the definition of $\mathcal{N}$, there exists $y_0 \in \mathcal{N}$ such that

$$\|x_0 - y_0\|_2 \le 2c_1. \tag{4.7}$$

On the one hand, by definition (2.3) of event $\mathcal{E}_K$, we have

$$\|Ay_0\|_2 \le \|A\| \le 3K\sqrt{n}. \tag{4.8}$$

On the other hand, it follows from (4.6) and (4.7) that

$$\|Ay_0 - u_0\|_2 \le \|A\| \|x_0 - y_0\|_2 + \|Ax_0 - u_0\|_2 \le 6c_1 K\sqrt{n} + c_{4.2}\sqrt{n}. \tag{4.9}$$

This and (4.8) yield that

$$\|u_0\|_2 \le 3K\sqrt{n} + 6c_1 K\sqrt{n} + c_{4.2}\sqrt{n} \le 10K\sqrt{n}.$$

So, we see that

$$u_0 \in \mathrm{span}(u) \cap 10K\sqrt{n} B_2^n =: E.$$

Let $\mathcal{M}$ be some fixed $(c_1 K\sqrt{n})$-net of the interval $E$, such that

$$|\mathcal{M}| \le \frac{20}{c_1}. \tag{4.10}$$

Let us choose a vector $v_0 \in \mathcal{M}$ such that $\|u_0 - v_0\|_2 \le c_1 K\sqrt{n}$. It follows from (4.9) that

$$\|Ay_0 - v_0\|_2 \le 6c_1 K\sqrt{n} + c_{4.2}\sqrt{n} + c_1 K\sqrt{n} \le (7c_1 K + c_{4.2})\sqrt{n}.$$

Choose values of $c_1, c_{4.2} \in (0, 1)$ so that $7c_1 K + c_{4.2} \le c_{4.1}$, where $c_{4.1}$ is the constant from Proposition 4.1.

Summarizing, we have shown that the event $\mathcal{E}$ implies the existence of vectors $y_0 \in \mathcal{N}$ and $v_0 \in \mathcal{M}$ such that $\|Ay_0 - v_0\|_2 \le c_{4.1}\sqrt{n}$. Taking the union bound over $\mathcal{N}$ and $\mathcal{M}$, we conclude that

$$\mathbb{P}(\mathcal{E}) \le |\mathcal{N}| \cdot |\mathcal{M}| \max_{y_0 \in \mathcal{N},\, v_0 \in \mathcal{M}} \mathbb{P}\{\|Ay_0 - v_0\|_2 \le c_{4.1}\sqrt{n}\}.$$

Applying Proposition 4.1 and using the estimates (4.5), (4.10) on the cardinalities of the nets, we obtain

$$\mathbb{P}(\mathcal{E}) \le \Big(\frac{9}{c_0 c_1}\Big)^{c_0 n} \cdot \frac{20}{c_1} \cdot 2e^{-c_{4.1} n}.$$

Choosing $c_0 > 0$ small enough depending on $c_1$ and $c_{4.1}$, we can ensure that

$$\mathbb{P}(\mathcal{E}) \le 2e^{-c_{4.1} n/2}$$

as required. This completes the proof. □

As an immediate consequence of Proposition 4.2, we obtain a very good bound for the first half of the invertibility problem in (3.1). Indeed, since $\varepsilon n^{-1/2} \le c_{4.2}\sqrt{n}$, we have

$$\mathbb{P}\Big\{ \inf_{x \in \mathrm{Comp}(c_0, c_1)} \|Ax\|_2 \le \varepsilon n^{-1/2} \wedge \mathcal{E}_K \Big\} \le 2e^{-c_{4.2} n}. \tag{4.11}$$

Remark 4.3 (Fixing $c_0$, $c_1$). At this point we fix some values $c_0 = c_0(K, M_4)$ and $c_1 = c_1(K, M_4)$ satisfying Proposition 4.2 for the rest of the argument.

5 Distance problem via small ball probabilities for quadratic forms

The second part of the invertibility problem in (3.1), the one for incompressible vectors, is more difficult. Recall that Lemma 3.9 reduces the invertibility problem to the distance problem, namely to an upper bound on the probability

$$\mathbb{P}\{ \mathrm{dist}(A_1, H_1) \le \varepsilon \}$$

where $A_1$ is the first column of $A$ and $H_1$ is the span of the other columns. (By a permutation of the indices in $[n]$, the same bound would hold for all $\mathrm{dist}(A_k, H_k)$ as required in Lemma 3.9.)

The following proposition reduces the distance problem to the small ball probability for quadratic forms of random variables:

Proposition 5.1 (Distance problems via quadratic forms). Let $A = (a_{ij})$ be an arbitrary $n \times n$ matrix. Let $A_1$ denote the first column of $A$ and $H_1$ denote the span of the other columns. Furthermore, let $B$ denote the $(n - 1) \times (n - 1)$ minor of $A$ obtained by removing the first row and the first column from $A$, and let $X \in \mathbb{R}^{n-1}$ denote the first column of $A$ with the first entry removed. Then

$$\mathrm{dist}(A_1, H_1) = \frac{|\langle B^{-1}X, X\rangle - a_{11}|}{\sqrt{1 + \|B^{-1}X\|_2^2}}.$$

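Proof. Let $h \in S^{n-1}$ denote a normal to the hyperplane $H_1$; choose the sign of the normal arbitrarily. We decompose

$$h = \begin{pmatrix} h_1 \\ g \end{pmatrix},$$

where $h_1 \in \mathbb{R}$ and $g \in \mathbb{R}^{n-1}$. Then

$$\mathrm{dist}(A_1, H_1) = |\langle A_1, h\rangle| = |a_{11} h_1 + \langle X, g\rangle|. \tag{5.1}$$

Since $h$ is orthogonal to the columns of the matrix $\begin{pmatrix} X^T \\ B \end{pmatrix}$, we have

$$0 = \begin{pmatrix} X^T \\ B \end{pmatrix}^T h = h_1 X + Bg,$$

so

$$g = -h_1 B^{-1}X. \tag{5.2}$$

Furthermore,

$$1 = \|h\|_2^2 = h_1^2 + \|g\|_2^2 = h_1^2 + h_1^2 \|B^{-1}X\|_2^2.$$

Hence

$$h_1^2 = \frac{1}{1 + \|B^{-1}X\|_2^2}. \tag{5.3}$$

So, using (5.2) and (5.3), we can express the distance in (5.1) as

$$\mathrm{dist}(A_1, H_1) = |a_{11} h_1 - \langle h_1 B^{-1}X, X\rangle| = \frac{|\langle B^{-1}X, X\rangle - a_{11}|}{\sqrt{1 + \|B^{-1}X\|_2^2}}.$$

This completes the proof. □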
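The distance formula can be verified on a small symmetric example. In dimension 3 the hyperplane $H_1$ is spanned by two columns, so its normal is a cross product and the distance can be computed directly. The sketch below compares this direct value with the quadratic-form expression; the matrix entries are arbitrary illustrative numbers, and symmetry of $A$ (as in the paper's setting) is assumed so that the first row and first column share the same vector $X$.

```python
import math

def cross(a, b):
    """Cross product in R^3."""
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

A = [[2.0, 1.0, -1.0],
     [1.0, 3.0, 0.5],
     [-1.0, 0.5, 4.0]]  # symmetric
cols = [[A[i][j] for i in range(3)] for j in range(3)]

# Direct computation: H_1 = span(A_2, A_3), whose normal is the cross product.
h = cross(cols[1], cols[2])
direct = abs(dot(cols[0], h)) / math.sqrt(dot(h, h))

# Formula: B is the lower-right 2x2 minor, X the first column without its first entry.
a11, X = A[0][0], [A[1][0], A[2][0]]
(b11, b12), (b21, b22) = A[1][1:], A[2][1:]
det = b11 * b22 - b12 * b21
BinvX = [(b22 * X[0] - b12 * X[1]) / det, (-b21 * X[0] + b11 * X[1]) / det]
formula = abs(dot(BinvX, X) - a11) / math.sqrt(1 + dot(BinvX, BinvX))

print(direct, formula)
```

The two printed values agree up to rounding error.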

Remark 5.2 ($A$ versus $B$). Let us apply Proposition 5.1 to the $n \times n$ random matrix $A$ which satisfies assumptions (A). Recall that $a_{11}$ is a fixed number, so the problem reduces to estimating the small ball probabilities for the quadratic form $\langle B^{-1}X, X\rangle$. Observe that $X$ is a random vector that is independent of $B$, and whose entries satisfy the familiar moment assumptions (2.5).

The random matrix $B$ has the same structure as $A$ except it is $(n - 1) \times (n - 1)$ rather than $n \times n$. For this reason, it will be convenient to develop the theory in dimension $n$, that is for the quadratic forms $\langle A^{-1}X, X\rangle$, where $X$ is an independent random vector. At the end, the theory will be applied in dimension $n - 1$ for the matrix $B$.

6 Small ball probabilities for quadratic forms via additive structure

In order to produce good (super-polynomial) bounds for the small ball probabilities for the quadratic forms $\langle A^{-1}X, X\rangle$, we will have to take into account the additive structure of the vector $A^{-1}X$. Let us first review the corresponding theory for linear forms, which is sometimes called the Littlewood-Offord theory. We will later extend it (by decoupling) to quadratic forms.

6.1 Small ball probabilities via LCD 

The linear Littlewood-Offord theory concerns the small ball probabilities for the sums of the form $S = \sum_k x_k \xi_k$ where $\xi_k$ are identically distributed independent random variables, and $x = (x_1, \ldots, x_n) \in S^{n-1}$ is a given coefficient vector. Lemma 3.3 gives a general bound on the concentration function, $\mathcal{L}(S, \varepsilon) \le p$. But this bound is too weak: it produces a fixed probability $p$ for all $\varepsilon$, even when $\varepsilon$ approaches zero. Finer estimates are not possible for general sums; for example, the sum $S = \pm 1 \pm 1$ with random independent signs equals zero with fixed probability $1/2$. Nevertheless, one can break the barrier of fixed probability by taking into account the additive structure in the coefficient vector $x$.
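The effect of additive structure can be seen by computing the concentration function exactly for sums with random signs. The following sketch enumerates all $2^n$ sign patterns, so it is only meant for small $n$; the function name and the "generic" coefficient vector are illustrative assumptions.

```python
import itertools
import math
from collections import Counter

def rademacher_concentration(x, eps):
    """Exact L(S, eps) for S = sum_k x_k * xi_k with independent uniform +-1 signs xi_k."""
    n = len(x)
    # Enumerate all 2^n sign patterns; rounding merges atoms that differ only by float error.
    atoms = Counter(round(sum(s * t for s, t in zip(signs, x)), 12)
                    for signs in itertools.product((-1, 1), repeat=n))
    total = 2 ** n
    # A closed interval of length 2*eps can be slid until its left end sits on an atom
    # without losing mass, so it suffices to try intervals starting at atoms.
    return max(sum(c for b, c in atoms.items() if a <= b <= a + 2 * eps + 5e-13)
               for a in atoms) / total

n = 10
flat = [1 / math.sqrt(n)] * n                   # equal coefficients: strong additive structure
generic = [math.sqrt(k + 1) for k in range(n)]  # an illustrative, less structured choice
norm = math.sqrt(sum(t * t for t in generic))
generic = [t / norm for t in generic]

print(rademacher_concentration([1 / math.sqrt(2)] * 2, 0.0))  # the sum +-1 +-1, rescaled
print(rademacher_concentration(flat, 0.0), rademacher_concentration(generic, 0.0))
```

For equal coefficients the largest atom has mass $\binom{10}{5}/2^{10}$, while for the less structured vector the atoms are much smaller.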

The amount of additive structure in $x \in \mathbb{R}^n$ is captured by the least common denominator (LCD) of $x$. If the coordinates $x_k = p_k/q_k$ are rational numbers, then a suitable measure of additive structure in $x$ is the least denominator $D(x)$ of these ratios, which is the least common multiple of the integers $q_k$. Equivalently, $D(x)$ is the smallest number $\theta > 0$ such that $\theta x \in \mathbb{Z}^n$. An extension of this concept for general vectors with real coefficients was developed in [13, 14], see also [15]; the particular form of this concept we shall use here is proposed by M. Rudelson (unpublished).

Definition 6.1 (LCD). Let $L \ge 1$. We define the least common denominator (LCD) of $x \in \mathbb{R}^n$ as

$$D_L(x) = \inf\Big\{ \theta > 0 : \mathrm{dist}(\theta x, \mathbb{Z}^n) < L\sqrt{\log_+(\theta/L)} \Big\}.$$

If the vector $x$ is considered in $\mathbb{R}^I$ for some subset $I \subseteq [n]$, then in this definition we replace $\mathbb{Z}^n$ by $\mathbb{Z}^I$.

Clearly, one always has $D_L(x) \ge L$. A more sensitive but still quite simple bound is the following one:

Lemma 6.2. For every $x \in S^{n-1}$ and every $L \ge 1$, one has

$$D_L(x) \ge \frac{1}{2\|x\|_\infty}.$$

Proof. Let $\theta := D_L(x)$, and assume that $\theta < \frac{1}{2\|x\|_\infty}$. Then $\|\theta x\|_\infty < 1/2$. Therefore, by looking at the coordinates of the vector $\theta x$ one sees that the vector $p \in \mathbb{Z}^n$ that minimizes $\|\theta x - p\|_2$ is $p = 0$. So

$$\mathrm{dist}(\theta x, \mathbb{Z}^n) = \|\theta x\|_2 = \theta.$$

On the other hand, by the definition of LCD, we have

$$\mathrm{dist}(\theta x, \mathbb{Z}^n) \le L\sqrt{\log_+(\theta/L)}.$$

However, the inequality $\theta \le L\sqrt{\log_+(\theta/L)}$ has no solutions in $\theta > 0$. This contradiction completes the proof. □
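To make Definition 6.1 concrete, the following sketch approximates $D_L(x)$ by scanning a grid of values of $\theta$ and returning the first grid point that satisfies the defining inequality. This is only a rough numerical illustration: a grid scan can miss narrow windows and therefore may overestimate the infimum. The function names, the step size and the test vector are assumptions made for the example.

```python
import math

def dist_to_lattice(theta, x):
    """Euclidean distance from theta*x to the integer lattice Z^n."""
    return math.sqrt(sum((theta * t - round(theta * t)) ** 2 for t in x))

def lcd_grid(x, L, theta_max, step=1e-3):
    """Grid approximation of D_L(x): the first grid point theta with
    dist(theta*x, Z^n) < L*sqrt(log_+(theta/L)); returns None if none is found."""
    k = 1
    while k * step <= theta_max:
        theta = k * step
        threshold = L * math.sqrt(max(math.log(theta / L), 0.0))  # log_+ = max(log, 0)
        if dist_to_lattice(theta, x) < threshold:
            return theta
        k += 1
    return None

x = [1 / math.sqrt(2)] * 2  # sqrt(2) * x = (1, 1) is an integer point
print(dist_to_lattice(math.sqrt(2), x))
d = lcd_grid(x, L=1.0, theta_max=5.0)
# Compare with the lower bound 1 / (2 * ||x||_infty) of Lemma 6.2.
print(d, ">=", 1 / (2 * max(abs(t) for t in x)))
```

For this vector the scan stops shortly above $\theta = 1.1$, consistent with both the trivial bound $D_L(x) \ge L$ and Lemma 6.2.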

The goal of our variant of Littlewood-Offord theory is to express the small ball probabilities of sums $\mathcal{L}(S, \varepsilon)$ in terms of $D(x)$. This is done in the following theorem, which is a version of results from [13, 14]; this particular simplified form is close to the form put forth by M. Rudelson (unpublished).

Theorem 6.3 (Small ball probabilities via LCD). Let $\xi_1, \ldots, \xi_n$ be independent and identically distributed random variables. Assume that there exist numbers $\varepsilon_0, p_0, M_1 > 0$ such that $\mathcal{L}(\xi_k, \varepsilon_0) \le 1 - p_0$ and $\mathbb{E}|\xi_k| \le M_1$ for all $k$. Then there exists $C_{6.3}$ which depends only on $\varepsilon_0$, $p_0$ and $M_1$, and such that the following holds. Let $x \in S^{n-1}$ and consider the sum $S = \sum_{k=1}^n x_k \xi_k$. Then for every $L \ge p_0^{-1/2}$ and $\varepsilon \ge 0$ one has

$$\mathcal{L}(S, \varepsilon) \le C_{6.3} L \Big( \varepsilon + \frac{1}{D_L(x)} \Big).$$
The proof of Theorem 6.3 is based on Esseen's Lemma, see e.g. [17], p. 290.

Lemma 6.4 (C.-G. Esseen). Let $Y$ be a random variable. Then

$$\mathcal{L}(Y, 1) \le C_{6.4} \int_{-1}^{1} |\phi_Y(\theta)| \, d\theta$$

where $\phi_Y(\theta) = \mathbb{E}\exp(2\pi i \theta Y)$ is the characteristic function of $Y$, and $C_{6.4}$ is an absolute constant.

Proof of Theorem 6.3. By replacing $\xi_k$ with $\xi_k/\varepsilon_0$, we can assume without loss of generality that $\varepsilon_0 = 1$. We apply Esseen's Lemma 6.4 for $Y = S/\varepsilon$. Using independence of $\xi_k$, we obtain

$$\mathcal{L}(S, \varepsilon) \le C_{6.4} \int_{-1}^{1} \prod_{k=1}^n \Big| \phi\Big(\frac{\theta x_k}{\varepsilon}\Big) \Big| \, d\theta, \tag{6.1}$$

where $\phi(t) = \mathbb{E}\exp(2\pi i t \xi)$ is the characteristic function of $\xi := \xi_1$.

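We proceed with a conditioning argument similar to the ones used in [12, 13, 14]. Let $\xi'$ denote an independent copy of $\xi$, and let $\bar{\xi} = \xi - \xi'$; then $\bar{\xi}$ is a symmetric random variable. By symmetry, we have

$$|\phi(t)|^2 = \mathbb{E}\exp(2\pi i t \bar{\xi}) = \mathbb{E}\cos(2\pi t \bar{\xi}).$$

Using the inequality $|x| \le \exp\big[-\frac{1}{2}(1 - x^2)\big]$, which is valid for all $x \in \mathbb{R}$, we obtain

$$|\phi(t)| \le \exp\Big[ -\frac{1}{2}\big(1 - \mathbb{E}\cos(2\pi t \bar{\xi})\big) \Big]. \tag{6.2}$$

By assumption, we have $\mathcal{L}(\xi, 1) \le 1 - p_0$. Conditioning on $\xi'$, we see that $\mathbb{P}\{|\bar{\xi}| \ge 1\} \ge p_0$. Furthermore, another assumption of the theorem implies that $\mathbb{E}|\bar{\xi}| \le 2\mathbb{E}|\xi| \le 2M_1$. Using Markov's inequality, we conclude that $\mathbb{P}\{|\bar{\xi}| \ge 4M_1/p_0\} \le p_0/2$. Combining the two probability bounds, we see that the event

$$\mathcal{E} := \{1 \le |\bar{\xi}| \le C_0\} \quad \text{satisfies} \quad \mathbb{P}\{\mathcal{E}\} \ge p_0/2, \quad \text{where } C_0 := \frac{4M_1}{p_0}.$$

We then estimate the expectation appearing in (6.2) by conditioning on $\mathcal{E}$: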

1 - Ecos(27rt^) > ¥{£} • E[1 - cos(27rtO | 



>^.E 
- 2 

= 8poIE 



min|27rtf - 2™^ \£ 



min \t£^ — q\ \£ 



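Substituting this into (6.2) and then into (6.1), and using Jensen's inequality, we obtain

$$\mathcal{L}(S, \varepsilon) \le C_{6.4} \int_{-1}^{1} \exp\Big( -4p_0 \, \mathbb{E}\Big[ \sum_{k=1}^n \min_{q \in \mathbb{Z}} \Big| \frac{\bar{\xi}\theta}{\varepsilon} x_k - q \Big|^2 \,\Big|\, \mathcal{E} \Big] \Big) \, d\theta \le C_{6.4} \, \mathbb{E}\Big[ \int_{-1}^{1} \exp\Big( -4p_0 \, \mathrm{dist}\Big( \frac{\bar{\xi}\theta}{\varepsilon} x, \mathbb{Z}^n \Big)^2 \Big) \, d\theta \,\Big|\, \mathcal{E} \Big].$$

Since the integrand is an even function of $\theta$, we can integrate over $[0, 1]$ instead of $[-1, 1]$ at the cost of an extra factor of 2. Also, replacing the expectation by the maximum and using the definition of the event $\mathcal{E}$, we obtain

$$\mathcal{L}(S, \varepsilon) \le 2C_{6.4} \sup_{1 \le z \le C_0} \int_0^1 \exp\big( -4p_0 f_z^2(\theta) \big) \, d\theta \tag{6.3}$$

where

$$f_z(\theta) = \mathrm{dist}\Big( \frac{z\theta}{\varepsilon} x, \mathbb{Z}^n \Big).$$

Suppose that

$$\varepsilon > \varepsilon_0 := \frac{C_0}{D_L(x)}.$$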

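Then, for every $1 \le z \le C_0$ and every $\theta \in [0, 1]$, we have $\frac{z\theta}{\varepsilon} < D_L(x)$. By the definition of $D_L(x)$, this means that

$$f_z(\theta) = \mathrm{dist}\Big( \frac{z\theta}{\varepsilon} x, \mathbb{Z}^n \Big) \ge L\sqrt{\log_+\Big( \frac{z\theta}{\varepsilon L} \Big)}.$$

Putting this estimate back into (6.3), we obtain

$$\mathcal{L}(S, \varepsilon) \le 2C_{6.4} \sup_{z \ge 1} \int_0^1 \exp\Big( -4p_0 L^2 \log_+\Big( \frac{z\theta}{\varepsilon L} \Big) \Big) \, d\theta.$$

After the change of variable $t = \frac{z\theta}{\varepsilon L}$ and using that $z \ge 1$ we have

$$\mathcal{L}(S, \varepsilon) \le 2C_{6.4} L\varepsilon \int_0^\infty \exp\big( -4p_0 L^2 \log_+ t \big) \, dt = 2C_{6.4} L\varepsilon \Big( 1 + \int_1^\infty t^{-4p_0 L^2} \, dt \Big).$$

Since $p_0 L^2 \ge 1$ by assumption, the integral in the right hand side is bounded by an absolute constant, so

$$\mathcal{L}(S, \varepsilon) \le C_1 L\varepsilon$$

where $C_1$ is an absolute constant.

Finally, suppose that $\varepsilon \le \varepsilon_0$. Applying the previous part for $2\varepsilon_0$, we get

$$\mathcal{L}(S, \varepsilon) \le \mathcal{L}(S, 2\varepsilon_0) \le 2C_1 L\varepsilon_0 = \frac{2C_1 L C_0}{D_L(x)}.$$

This completes the proof of Theorem 6.3. □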

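Remark 6.5. For a general, not necessarily unit vector $x \in \mathbb{R}^n$, the conclusion of Theorem 6.3 reads as

$$\mathcal{L}(S, \varepsilon) = \mathcal{L}\Big( \frac{S}{\|x\|_2}, \frac{\varepsilon}{\|x\|_2} \Big) \le C_{6.3} L \Big( \frac{\varepsilon}{\|x\|_2} + \frac{1}{D_L(x/\|x\|_2)} \Big).$$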

6.2 Regularized LCD

As we saw in Proposition 5.1, the distance problem reduces to a quadratic Littlewood-Offord problem, for quadratic forms of the type $\sum_{i,j} x_{ij}\xi_i\xi_j$. We will seek to reduce the quadratic problem to a linear one by decoupling and conditioning arguments. This process requires a more robust version of the concept of the LCD, which we develop now.

Let $x \in \mathrm{Incomp}(c_0, c_1)$; recall that we have fixed the values $c_0 = c_0(K, M_4)$, $c_1 = c_1(K, M_4)$ in Remark 4.3. By Lemma 3.8, at least $\frac{1}{2} c_0 c_1^2 n$ coordinates $x_k$ of $x$ satisfy

$$\frac{c_1}{\sqrt{2n}} \le |x_k| \le \frac{1}{\sqrt{c_0 n}}. \tag{6.4}$$

Let us fix some constant $c_{00}$ such that

-coci < Coo < ^; 

we can make the value of $c_{00}$ depend only on $c_0$ and $c_1$ (hence only on parameters $K$ and $M_4$). Then for every vector $x \in \mathrm{Incomp}(c_0, c_1)$ we can assign a subset called $\mathrm{spread}(x) \subseteq [n]$ so that

$$|\mathrm{spread}(x)| = \lceil c_{00} n \rceil$$

and so that (6.4) holds for all $k \in \mathrm{spread}(x)$.

The point here is that not all of the coordinates $x_k$ satisfying (6.4) will be good in the future; the set $\mathrm{spread}(x)$ will allow us to include only the good ones. At this point, we consider an arbitrary valid assignment of $\mathrm{spread}(x)$ to $x$; the particular choice of the assignment will be determined later.

Our new version of LCD is designed to capture the amount of structure in the least structured part of the coefficients of $x$.

Definition 6.6 (Regularized LCD). Let $\lambda \in (0, c_{00})$ and $L \ge 1$. We define the regularized LCD of a vector $x \in \mathrm{Incomp}(c_0, c_1)$ as

$$D_L(x, \lambda) = \max\Big\{ D_L(x_I/\|x_I\|_2) : I \subseteq \mathrm{spread}(x), \ |I| = \lceil \lambda n \rceil \Big\}.$$

Denote by $I(x)$ the maximizing set $I$ in this definition.

Remark 6.7. Since the sets $I$ in this definition are subsets of $\mathrm{spread}(x)$, inequalities (6.4) imply that

$$c_{6.7}\sqrt{\lambda} \le \|x_I\|_2 \le C_{6.7}\sqrt{\lambda}$$

where $c_{6.7} = c_1/\sqrt{2}$ and $C_{6.7} = 1/\sqrt{c_0}$.

Lemma 6.8. For every $x \in \mathrm{Incomp}(c_0, c_1)$ and every $\lambda \in (0, c_{00})$ and $L \ge 1$, one has

$$D_L(x, \lambda) \ge c_{6.8}\sqrt{\lambda n}.$$

Here $c_{6.8} \in (0, 1)$ depends only on $c_0$ and $c_1$.

Proof. Consider a subset $I$ as in the definition of $D_L(x, \lambda)$. Denote $z_I := x_I/\|x_I\|_2$. By (6.4) and Remark 6.7, we have $\|z_I\|_\infty \le C/\sqrt{\lambda n}$ where $C > 0$ depends only on $c_0$ and $c_1$. Then Lemma 6.2 implies that

$$D_L(z_I) \ge \frac{\sqrt{\lambda n}}{2C}.$$

By the definition of $D_L(x, \lambda)$, the proof is complete. □

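Now we state a version of Theorem 6.3 for the regularized LCD.

Proposition 6.9 (Small ball probabilities via regularized LCD). Let $\xi_1, \ldots, \xi_n$ be independent and identically distributed random variables. Assume that there exist numbers $\varepsilon_0, p_0, M_1 > 0$ such that $\mathcal{L}(\xi_k, \varepsilon_0) \le 1 - p_0$ and $\mathbb{E}|\xi_k| \le M_1$ for all $k$. Then there exists $C_{6.9}$ which depends only on $\varepsilon_0$, $p_0$, and $M_1$, and such that the following holds.

Consider a vector $x \in \mathrm{Incomp}(c_0, c_1)$ and a subset $J \subseteq [n]$ such that $J \supseteq I(x)$. Consider also the sum $S_J = \sum_{k \in J} x_k \xi_k$. Then for every $\lambda \in (0, c_{00})$, $L \ge p_0^{-1/2}$ and $\varepsilon \ge 0$, one has

$$\mathcal{L}(S_J, \varepsilon) \le C_{6.9} L \Big( \frac{\varepsilon}{\sqrt{\lambda}} + \frac{1}{D_L(x, \lambda)} \Big).$$

Proof. Note that for every two sets $I \subseteq J \subseteq [n]$, the corresponding sums satisfy $\mathcal{L}(S_J, \varepsilon) \le \mathcal{L}(S_I, \varepsilon)$; this follows by conditioning on the random variables $\xi_k$ with $k \in J \setminus I$. Applying this relation for $I := I(x) \subseteq J$, we obtain

$$\mathcal{L}(S_J, \varepsilon) \le \mathcal{L}(S_I, \varepsilon) \le C_{6.3} L \Big( \frac{\varepsilon}{\|x_I\|_2} + \frac{1}{D_L(x_I/\|x_I\|_2)} \Big) \quad \text{(by Remark 6.5)}$$

$$\le C_{6.3} L \Big( \frac{\varepsilon}{c_{6.7}\sqrt{\lambda}} + \frac{1}{D_L(x, \lambda)} \Big) \quad \text{(by Remark 6.7)}.$$

This completes the proof. □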

Remark 6.10. By Lemma 3.2, both Theorem 6.3 and Proposition 6.9 can be applied for arbitrary independent and identically distributed random variables $\xi_1, \ldots, \xi_n$ that have unit variance and finite fourth moment. In particular, Theorem 6.3 and Proposition 6.9 apply if $\xi_k$ satisfy the same moment assumptions (2.5) as the entries $a_{ij}$ of $A$. The constants $C_{6.3}$ and $C_{6.9}$ in this case depend only on the fourth moment parameter $M_4$ from the assumptions (A) on the random matrix $A$.

6.3 Small ball probabilities for Ax via regularized LCD

We will now develop a refinement of Proposition 4.1 that is sensitive to the additive structure of the vector $x$.

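Proposition 6.11 (Small ball probabilities for Ax via regularized LCD). Let $A$ be a random matrix which satisfies (A). Let $x \in \mathrm{Incomp}(c_0, c_1)$ and $\lambda \in (0, c_{00})$. Then for every $L \ge L_0$ and $\varepsilon \ge 0$, one has

$$\mathcal{L}(Ax, \varepsilon\sqrt{n}) \le \Big[ \frac{C_{6.11} L\varepsilon}{\sqrt{\lambda}} + \frac{C_{6.11} L}{D_L(x, \lambda)} \Big]^{n - \lceil \lambda n \rceil}.$$

Here $C_{6.11}$ and $L_0$ depend only on the parameters $K$ and $M_4$ from the assumptions.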
Proof. Our goal is to bound above the probability

$$\mathbb{P}\{\|Ax - u\|_2 \le \varepsilon\sqrt{n}\}$$

for an arbitrary fixed vector $u \in \mathbb{R}^n$.

Let $I = I(x)$ be the maximizing set from the definition of $D_L(x, \lambda)$. We decompose the set of indices $[n]$ into sets $I \cup I^c$ similarly to how we did it in the proof of Proposition 4.1. This induces the decomposition of the matrix $A$ and both vectors in question, which we denote

$$A = \begin{pmatrix} D & G \\ G^* & E \end{pmatrix}, \qquad x = \begin{pmatrix} y \\ z \end{pmatrix}, \qquad u = \begin{pmatrix} v \\ w \end{pmatrix},$$

where $D$ is an $I^c \times I^c$ matrix, $G$ is an $I^c \times I$ matrix, $y, v \in \mathbb{R}^{I^c}$ and $z, w \in \mathbb{R}^I$. This way, we express

$$\|Ax - u\|_2^2 = \|Dy + Gz - v\|_2^2 + \|G^*y + Ez - w\|_2^2.$$

Let us condition on an arbitrary realization of the minors $D$ and $E$. Denoting $u_0 := v - Dy$, we have

$$\|Ax - u\|_2 \ge \|Gz - u_0\|_2.$$

We will use the crucial facts that $G$ is an $I^c \times I$ matrix with independent entries, and $u_0$ is a fixed vector in $\mathbb{R}^{I^c}$. The $i$-th coordinate of the vector $Gz \in \mathbb{R}^{I^c}$ is

$$(Gz)_i = \sum_{j \in I} a_{ij} x_j, \quad i \in I^c.$$

All random variables $a_{ij}$ here are independent. So we can apply Proposition 6.9 with $J = I = I(x)$ (see Remark 6.10), and we obtain

$$\mathcal{L}((Gz)_i, \varepsilon) \le C_{6.9} L \Big( \frac{\varepsilon}{\sqrt{\lambda}} + \frac{1}{D_L(x, \lambda)} \Big), \quad i \in I^c.$$

Since the coordinates $(Gz)_i$ of the random vector $Gz$ are independent, Tensorization Lemma 3.4 (see Remark 3.5) implies that

$$\mathcal{L}(Gz, \varepsilon\sqrt{|I^c|}) \le \Big[ \frac{CL\varepsilon}{\sqrt{\lambda}} + \frac{CL}{D_L(x, \lambda)} \Big]^{|I^c|}$$

where $C$ depends on $C_{6.9}$ only. This concludes the proof since $|I^c| = n - \lceil \lambda n \rceil \ge n/2$. □



7 Estimating additive structure 

Recall that our goal is to estimate the small ball probabilities for the quadratic forms of the type $\langle A^{-1}X, X\rangle$. In accordance with the spirit of Littlewood-Offord theory, we will first need to estimate the amount of additive structure in the random vector $A^{-1}X$. In this section, we indeed show that the regularized LCD of $A^{-1}X$ is large for every fixed $X$. This will be used later along with a decoupling argument to bound the small ball probabilities for $\langle A^{-1}X, X\rangle$.

Recall that the values of constants $c_0$, $c_1$, $c_{00}$ are already chosen in Remark 4.3; they depend only on parameters $K$, $M_4$.

Theorem 7.1 (Structure theorem). Let $A$ be a random matrix which satisfies (A). There exist $c_{7.1} > 0$ and $L_0 \ge 1$ that depend only on the parameters $K$ and $M_4$ from assumptions (2.3), (2.5), and such that the following holds. Let $u \in \mathbb{R}^n$ be an arbitrary fixed vector, and consider $x_0 := A^{-1}u/\|A^{-1}u\|_2$. Let $L \ge L_0$ and $n^{-c_{7.1}} \le \lambda \le c_{00}/3$. Consider the event

$$\mathcal{E} = \Big\{ x_0 \in \mathrm{Incomp}(c_0, c_1) \text{ and } D_L(x_0, \lambda) \ge L^{-2} n^{c_{7.1}/\lambda} \Big\}.$$

Then

$$\mathbb{P}(\mathcal{E}^c \cap \mathcal{E}_K) \le 2e^{-n}.$$

We shall first prove the easier fact that $x_0 \in \mathrm{Incomp}(c_0, c_1)$. The more difficult part of the theorem is the estimate on the LCD. Its proof will be based on the probability bound of Proposition 6.11 and nontrivial covering estimates for the sets of vectors with given LCD, which we shall develop in Section 7.1.

Lemma 7.2 ($A^{-1}u$ is incompressible). In the setting of Theorem 7.1, consider the event

$$\mathcal{E}_1 = \{x_0 \in \mathrm{Incomp}(c_0, c_1)\}.$$

Then

$$\mathbb{P}(\mathcal{E}_1^c \cap \mathcal{E}_K) \le 2e^{-c_{7.2} n}.$$

Here $c_{7.2} > 0$ depends only on the parameters $K$ and $M_4$ from assumptions (2.3), (2.5).

Proof. Denote $x = A^{-1}u$; then $Ax = u$. Therefore

$$\mathcal{E}_1^c \subseteq \Big\{ \exists x \in \mathbb{R}^n : \ x/\|x\|_2 \in \mathrm{Comp}(c_0, c_1) \wedge Ax = u \Big\}.$$

By Proposition 4.2, $\mathbb{P}(\mathcal{E}_1^c \cap \mathcal{E}_K) \le 2e^{-c_{4.2} n}$ as claimed. □

7.1 Covering sets of vectors with small LCD 

Definition 7.3 (Sublevel sets of LCD). Let us fix $\lambda \in (0, c_{00})$. For every value $D \ge 1$, we define the set

$$S_D = \{x \in \mathrm{Incomp}(c_0, c_1) : D_L(x, \lambda) \le D\}.$$

Our present goal is to bound the covering numbers of $S_D$.

Proposition 7.4 (Covering sublevel sets of regularized LCD). There exist C^m ( fm > 
which depend only on co,ci, and such that the following holds. Let A G 
(Cfr^/'T'; Coo/3) and L > 1. For every D > 1, the sublevel set Sd has a f3-net 
N such that 

^^\D ' ' ' - [(An)<E3j 

The main point of this result is the presence of the term $(\lambda n)^{c_{7.4}} \gg 1$ in the estimate of the cardinality of $\mathcal{N}$. This makes $|\mathcal{N}|$ substantially smaller than $(3/\beta)^n$, which is a trivial estimate on the $\beta$-net for the whole sphere $S^{n-1}$, see Lemma 2.2.

The proof of Proposition 7.4 relies on a series of lemmas of increasing generality. We begin by covering the level sets of the usual (not regularized) LCD. We shall work in a lower dimension $m$ for the time being; the definition of LCD is thus considered in $\mathbb{R}^m$.

Lemma 7.5. Let $c \in (0, 1)$, $D_0 \ge c\sqrt{m} \ge 1$ and $L \ge 1$. Then the set

$$\{x \in S^{m-1} : D_L(x) \in (D_0, 2D_0]\}$$

has a $\beta$-net $\mathcal{N}$ such that



Do \ ^Jm 

Here C depends only on c. 

Proof. Let $x$ be a vector from the set in question. By the definition of LCD, there exists $p \in \mathbb{Z}^m$ such that

$$\|D_L(x)x - p\|_2 \le L\sqrt{\log_+(2D_0/L)}. \tag{7.1}$$

Dividing both sides by $D_L(x)$ and using trivial estimates in the right hand side, we get

P 



Dl{x) 



< 



LVlog(2^o) 







Since ||x||2 = 1, the last two inequalities imply that 

P 



< 



2LVlog(2Do) 



Moreover, since $\|x\|_2 = 1$, we have

$$\|p\|_2 \le \|D_L(x)x - p\|_2 + \|D_L(x)x\|_2 \le L\sqrt{\log_+(2D_0/L)} + 2D_0 \le 4D_0.$$

This shows that the set



P 



: p G 



n4Z?, 



is indeed a $\beta$-net of the set in question. Counting the number of integer points in a ball by a standard volume argument, we estimate



|AA|< 1 + 



UDr 



CDr 



_ < 

V V'^ J 

This completes the proof. □ 
The next step is toward removing the lower bound for $D_L(x)$ in Lemma 7.5.

Lemma 7.6. Let $c \in (0, 1)$, $D \ge D_0 \ge c\sqrt{m} \ge 1$ and $L \ge 1$. Then the set

$$\{x \in S^{m-1} : D_L(x) \in (D_0, 2D_0]\}$$

has a $\beta$-net $\mathcal{N}$ such that



D - I- I — \^^^^ 
Here C depends only on c. 

Proof. By Lemma [731 we can cover the set in question with Euclidean 

balls of radius /3o = centered in the set, where Cq depends only on 

c. If /3o < /3 then the lemma is proved. Assume that /3q > (3. We can further 
cover every ball of radius /3o by balls of the smaller radius 13/2. According to 
Lemma [221 the number of smaller balls per larger ball is at most 



1 + 



4^ 



< 



5^ 



< 



31) 
The total number of smaller balls is then at most




By enlarging the radius of the balls from /3/2 to (3 as in Remark 12. H one can 
assume that they are centered in the set in question. This completes the proof. 

□ 

Now we can remove the flexible lower bound on $D_L(x)$ in Lemma 7.5.

Lemma 7.7. Let $c \in (0, 1)$ and $D$ be such that $D \ge c\sqrt{m} \ge 2$, and let $L \ge 1$. Then the set

$$\{x \in S^{m-1} : c\sqrt{m} \le D_L(x) \le D\}$$

has a $\beta$-net $\mathcal{N}$ such that

Here C depends only on c. 
Proof. We decompose the set

$$\{x \in S^{m-1} : c\sqrt{m} \le D_L(x) \le D\} \subseteq \bigcup_k \{x \in S^{m-1} : D_L(x) \in (2^{-k}D, 2^{-k+1}D]\},$$

where the union is over the integers $k$ such that the interval $(2^{-k}D, 2^{-k+1}D]$ has a nonempty intersection with the interval $[c\sqrt{m}, D]$. The assumptions imply that all such $k$ are nonnegative and $2^{-k}D \ge c\sqrt{m}/2 \ge 1$. So there are at most $\log_2 D$ terms in this union, and for each term one can construct a $\beta$-net using Lemma 7.6. The union of these nets forms a required net $\mathcal{N}$. □

Further, we remove the normalization requirement from the set to be covered.

Lemma 7.8. Let $c \in (0, 1)$ and $D$ be such that $D \ge c\sqrt{m} \ge 2$, and let $L \ge 1$. Then the set

$$\{x \in B_2^m : c\sqrt{m} \le D_L(x/\|x\|_2) \le D\} \tag{7.2}$$

has a $\beta$-net $\mathcal{N}$ such that

Here C depends only on c. 
Proof. Let $\mathcal{N}_0$ be a $\beta$-net of the set $\{x \in S^{m-1} : c\sqrt{m} \le D_L(x) \le D\}$ as in Lemma 7.7. For each $x \in \mathcal{N}_0$, let $\mathcal{N}_x$ denote a $\beta/2$-net of the interval $\mathrm{span}(x) \cap B_2^m$ such that $|\mathcal{N}_x| \le 4/\beta$. Then $\mathcal{N} := \bigcup_{x \in \mathcal{N}_0} \mathcal{N}_x$ clearly forms a $\beta$-net of the set in (7.2), and

A trivial estimate of the right hand side completes the proof. □ 



Proof of Proposition 7.4.

Step 1: decomposition. Consider a vector $x \in S_D$. Recall from Section 6.2 that $\mathrm{spread}(x) \subseteq [n]$ and $|\mathrm{spread}(x)| = \lceil c_{00} n \rceil$. Let us decompose $\mathrm{spread}(x)$ into disjoint sets

$$\mathrm{spread}(x) = I_1 \cup \cdots \cup I_{k_0} \cup J$$

for some $k_0$ such that

$$|I_k| = \lceil \lambda n \rceil \text{ for } k \le k_0, \qquad |J| < \lceil \lambda n \rceil,$$

and so that the sets fill $\mathrm{spread}(x)$ from left to right, i.e. $\sup I_k < \inf I_{k+1}$ and $\sup I_k < \inf J$ for all $k$. Since $\lambda < c_{00}$, we have $k_0 \ge 1$. Moreover, let

$$I_0 = [n] \setminus (I_1 \cup \cdots \cup I_{k_0}).$$

This produces a decomposition of $[n]$ into disjoint sets

$$[n] = I_0 \cup I_1 \cup \cdots \cup I_{k_0}. \tag{7.3}$$

This decomposition is obviously uniquely determined by the subset $\mathrm{spread}(x)$, and it does not otherwise depend on $x$.
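The decomposition of Step 1 is a simple chunking of the sorted index set. The following sketch spells it out; it uses 0-based indices instead of the paper's $[n] = \{1, \ldots, n\}$, and the function name, the index set and the value of $\lambda$ are hypothetical.

```python
import math

def decompose(spread, n, lam):
    """Split spread (a set of indices in range(n)) into consecutive blocks I_1..I_k0
    of size ceil(lam*n), a remainder J, and I_0 = complement of the blocks in range(n)."""
    size = math.ceil(lam * n)
    ordered = sorted(spread)              # fill spread(x) from left to right
    k0 = len(ordered) // size
    blocks = [ordered[i * size:(i + 1) * size] for i in range(k0)]
    J = ordered[k0 * size:]               # fewer than ceil(lam*n) leftover indices
    used = {k for block in blocks for k in block}
    I0 = [k for k in range(n) if k not in used]  # note that I_0 contains J
    return I0, blocks, J

n, lam = 20, 0.25
spread = {0, 1, 3, 4, 6, 7, 9, 10, 12, 15, 17, 19}  # a hypothetical spread(x)
I0, blocks, J = decompose(spread, n, lam)
print(blocks, J, I0)
```

As in the text, the output depends only on the index set and not on the values of the coordinates of $x$.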

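We notice two useful bounds that will help us later. Since $I_1 \cup \cdots \cup I_{k_0} = \mathrm{spread}(x) \setminus J$, we have

$$|I_1 \cup \cdots \cup I_{k_0}| \ge \lceil c_{00} n \rceil - \lceil \lambda n \rceil \ge c_{00} n/2 \tag{7.4}$$

and

$$k_0 \le \frac{\lceil c_{00} n \rceil}{\lceil \lambda n \rceil} \le \frac{2c_{00}}{\lambda}. \tag{7.5}$$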

Step 2: constructing nets for each component. Let us consider a fixed decomposition (7.3), and decompose the vector $x$ accordingly:

$$x = (x_{I_0}, x_{I_1}, \ldots, x_{I_{k_0}}).$$

We are going to construct separate $\beta$-nets for each component $x_{I_k}$, and combine them into one net for $S_D$.

A net $\mathcal{N}_0$ for the first component of $x$ is chosen trivially. Note that $x_{I_0} \in B_2^{I_0}$. By Lemma 2.2, we can choose a $(1/D)$-net $\mathcal{N}_0$ of $B_2^{I_0}$ with

For the other components of $x$, we will choose $\beta_0$-nets non-trivially, where



ft = ™M. (7,6) 

To this end, let us fix $k \le k_0$. Since $x \in S_D$, the definition of the regularized
LCD yields that

$D_L(x_{I_k}/\|x_{I_k}\|_2) \le \widehat D_L(x, \lambda) \le D$.

On the other hand, the argument in Lemma 6.8 yields

Dl{xi^^/\\xi^\\2) > qeM/Xn. 

By the assumptions, 



(We can choose a value of large enough so that this holds) . Thus 

DL{xjJ\\xjj2)>^V\h\>2- 

We have shown that $x_{I_k}$ belongs to the set

Vk := {y G Bt ■■ ^^Ah\ < DUy/Wyh) < D}. 

By Lemma 7.8, there exists a $\beta_0$-net $\mathcal N_k$ of $V_k$ with



Step 3: combining the nets. We are going to combine the nets $\mathcal N_k$ into one net
for $S_D$. So far we have shown that for every $x \in S_D$, there exist a decomposition
(7.3) and nets $\mathcal N_0, \mathcal N_1, \ldots, \mathcal N_{k_0}$ which are uniquely determined by the index set
$\mathrm{spread}(x)$, and there exist vectors $y_{I_k} \in \mathcal N_k$ such that

$\|x_{I_k} - y_{I_k}\|_2 \le \beta_0$, $k = 0, 1, \ldots, k_0$.

Consider the vector

$y = (y_{I_0}, y_{I_1}, \ldots, y_{I_{k_0}})$. (7.7)

It follows that

$\|x - y\|_2 = \Big( \sum_{k=0}^{k_0} \|x_{I_k} - y_{I_k}\|_2^2 \Big)^{1/2} \le \sqrt{k_0 + 1}\, \beta_0$.

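By (7.5) and since $\lambda \le c_{00}$, we have $k_0 + 1 \le 3c_{00}/\lambda$. Recalling the definition
(7.6) of $\beta_0$, we conclude that

$\|x - y\|_2 \le \frac{7\sqrt{c_{00}}\, L\sqrt{\log(2D)}}{\sqrt{\lambda}\, D} \le \frac{L\sqrt{\log(2D)}}{\sqrt{\lambda}\, D}$,

where we used that the value of $c_{00}$ can be chosen small enough (smaller than
1/49 in this case).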

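Consider the set $\mathcal N$ of vectors $y$ that can arise in (7.7). We showed that $\mathcal N$ is
a $\beta$-net of $S_D$. Moreover, since the index set $\mathrm{spread}(x)$ can be chosen in at most
$2^n$ ways, it follows that

$|\mathcal N| \le 2^n |\mathcal N_0| |\mathcal N_1| \cdots |\mathcal N_{k_0}| \le 2^n \cdot (3D)^{|I_0|} \cdot \prod_{k=1}^{k_0} \Big( \frac{CD}{\sqrt{|I_k|}} \Big)^{|I_k|} D^2$.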

To simplify this bound, note that $\sum_{k=1}^{k_0} |I_k| \ge c_{00} n/2$ by (7.4) and that $\sum_{k=0}^{k_0} |I_k| = n$
and $|I_k| \ge \lambda n \ge 1$ by construction. It follows that

Estimate (7.5) on $k_0$ implies that $2k_0 \le 1/\lambda$, which completes the proof of
Proposition 7.4. □



7.2 Proof of Structure Theorem 7.1

In Proposition 6.11, we estimated the small ball probabilities for the random
vector $Ax$ for a fixed vector $x$. Now we combine this with the covering results
of the previous section to obtain a bound that is uniform over all $x$ with small
regularized LCD. Recall that $S_D$ denotes the sublevel set of the regularized LCD
according to Definition 7.3.

Lemma 7.9 (Small ball probabilities on a sublevel set of LCD). There exist
$c, c' > 0$ and $L_0 \ge 1$ that depend only on the parameters $K$ and $M_4$ from the
assumptions (2.3), (2.5), and such that the following holds. Let $L \ge L_0$,
$0 < \lambda \le c_{00}/3$ and $1 \le D \le L^{-2} n^{c/\lambda}$. Then

$\mathbb P\{\exists x \in S_D : \|Ax - u\|_2 \le K\beta\sqrt{n} \wedge \mathcal E_K\} \le n^{-c'n}$, (7.8)

where

$\beta = \frac{L\sqrt{\log(2D)}}{\sqrt{\lambda}\, D}$.

Proof. We will first compute the probability for $S_D \setminus S_{D/2}$ instead of $S_D$ in (7.8).
Proposition 6.11 implies that for every $x \in S_D \setminus S_{D/2}$, one has



'{WAx-uh < eV^} < 



WTTV ^£ M6TTTV 



D 



n— [An] 



e>0. 



Let us use this inequality for $\varepsilon = 4K\beta$. Clearly, the term $\varepsilon/\sqrt{\lambda}$ dominates the term
$1/D$. So we obtain



»{pX - U\\2 < AKjjy/^} < 



XD 



n— [An] 



--■■Po- 



(Here the constant $C' = C'(K, M_4)$ absorbs the factor $K$.)

Let us choose a $\beta$-net $\mathcal N$ of $S_D \setminus S_{D/2}$ according to Proposition 7.4. The union
bound yields



'{3x G Af : \\Ax - u\\2 < 4K^^/^} < \Af \ ■ po 



< 



(An)IIll 



D 



l/A 



n— [An] 



=: Pi. 



One can estimate $p_1$ using the assumptions that $n$ is sufficiently large,
$0 < \lambda \le c_{00}/3$ and $1 \le D \le L^{-2} n^{c/\lambda}$. Choosing the constant $c > 0$ sufficiently small and
making simplifications, we obtain

Pi < n-""''. 

Suppose event $\mathcal E_K$ occurs, and suppose there exists $x \in S_D \setminus S_{D/2}$ such that
$\|Ax - u\|_2 \le K\beta\sqrt{n}$. There exists $x_0 \in \mathcal N$ such that $\|x - x_0\|_2 \le \beta$. Then

$\|Ax_0 - u\|_2 \le \|Ax - u\|_2 + \|A(x - x_0)\|_2 \le \|Ax - u\|_2 + \|A\| \|x - x_0\|_2$
$\le K\beta\sqrt{n} + 3K\sqrt{n} \cdot \beta = 4K\beta\sqrt{n}$.

As we know, the probability of the latter event is at most pi < n~'^"". So we 
have shown that 



"{Bx e Sd\ Sd/2 ■■ \\Ax - U\\2 < K^y/^ A Ek] < 



n 



Finally, we get rid of $S_{D/2}$ in this bound. Since $\beta$ decreases in $D$, as long as
$D/2 \ge 1$ the previous result can be applied for $D/2$ instead of $D$, and we get



"{3x G Sd/2\Sd/a ■■ \\Ax-u\\2 < K/3^a£k} < 



n 



We can continue this way for SD/4\SD/g, etc. So we decompose S = \^^=o(^2~'^d\ 
S2~k~i£)), where /cq is the largest integer such that l^'^'^D > q^jf/Xn. (Recall 
that by Proposition [Hill the set Sdq is empty if Dq < qa:stv ^'n- Since qpiv An > 1, 
we have < log2 D. The union bound then gives 

¥{3x e Sd : \\Ax - u\\2 < Kp./^ A 8k] < h ■ n-'^"" < log2(i?)n-^"" < n"^'" 
if the constant $c' > 0$ is chosen appropriately small. This completes the proof. □



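Proof of Structure Theorem 7.1. We fix constants $c, c', L_0$ given by Lemma 7.9.
Consider the following two events:

$\mathcal E_0 = \{\widehat D_L(x_0, \lambda) > L^{-2} n^{c/\lambda} =: D_0$ or $\widehat D_L(x_0, \lambda)$ is undefined$\}$,
$\mathcal E_1 = \{x_0 \in \mathrm{Incomp}(c_0, c_1)\}$.

Recall that if $\mathcal E_1$ holds then $\widehat D_L(x_0, \lambda)$ is defined. So our desired event $\mathcal E$ can be
written as

$\mathcal E = \mathcal E_1 \cap \mathcal E_0$.

Then $\mathcal E^c = \mathcal E_1^c \cup \mathcal E_0^c = \mathcal E_1^c \cup (\mathcal E_1 \cap \mathcal E_0^c)$. Finally, the event whose probability
we need to estimate is $\mathcal E^c \cap \mathcal E_K \subseteq (\mathcal E_1^c \cap \mathcal E_K) \cup (\mathcal E_1 \cap \mathcal E_0^c \cap \mathcal E_K)$. Hence

$\mathbb P(\mathcal E^c \cap \mathcal E_K) \le \mathbb P(\mathcal E_1^c \cap \mathcal E_K) + \mathbb P(\mathcal E_1 \cap \mathcal E_0^c \cap \mathcal E_K)$.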

The first term was estimated in Lemma 7.2 as

¥{£lr\£K) < 2e-'^. 

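It remains to obtain a similar estimate on the second term $\mathbb P(\mathcal E_1 \cap \mathcal E_0^c \cap \mathcal E_K)$. We
can express

$\mathcal E_1 \cap \mathcal E_0^c \cap \mathcal E_K = \{x_0 := A^{-1}u/\|A^{-1}u\|_2 \in S_{D_0} \wedge \mathcal E_K\}$.

Let $u_0 := Ax_0 = u/\|A^{-1}u\|_2$. Event $\mathcal E_K$ implies that $\|u_0\|_2 = \|Ax_0\|_2 \le \|A\| \le
3K\sqrt{n}$. Therefore $u_0$ lies on a one-dimensional interval:

$u_0 \in \mathrm{span}(u) \cap 3K\sqrt{n} B_2^n =: E$.

So

$\mathcal E_1 \cap \mathcal E_0^c \cap \mathcal E_K \subseteq \{\exists x_0 \in S_{D_0}\ \exists u_0 \in E : Ax_0 = u_0 \wedge \mathcal E_K\}$.

In view of an application of Lemma 7.9, let us choose

$\beta_0 := \frac{L\sqrt{\log(2D_0)}}{\sqrt{\lambda}\, D_0}$.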

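Let $\mathcal M$ denote some fixed $(K\beta_0\sqrt{n})$-net of the interval $E$, such that

$|\mathcal M| \le \frac{3K\sqrt{n}}{K\beta_0\sqrt{n}} = \frac{3}{\beta_0} \le 6D_0$.

So, for every $u_0 \in E$ we can choose $v_0 \in \mathcal M$ such that $\|u_0 - v_0\|_2 \le K\beta_0\sqrt{n}$.
Since $Ax_0 = u_0$, it follows that $\|Ax_0 - v_0\|_2 \le K\beta_0\sqrt{n}$. We have shown that

$\mathcal E_1 \cap \mathcal E_0^c \cap \mathcal E_K \subseteq \{\exists x_0 \in S_{D_0}, \exists v_0 \in \mathcal M : \|Ax_0 - v_0\|_2 \le K\beta_0\sqrt{n} \wedge \mathcal E_K\}$.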

An application of Lemma 7.9 and a union bound over $v_0 \in \mathcal M$ give

$\mathbb P(\mathcal E_1 \cap \mathcal E_0^c \cap \mathcal E_K) \le |\mathcal M| \cdot n^{-c'n} \le 6D_0 \cdot n^{-c'n} \le n^{-c'n/2}$

where we used that $D_0 \le n^{c/\lambda}$ and that we can assume the constant $c > 0$ to be
appropriately small. The proof of Structure Theorem 7.1 is complete. □



8 Small ball probabilities for quadratic forms

Now that we have developed the machinery for estimating small ball probabilities, we can
come back to our main task, estimating the small ball probability for quadratic
forms. Recall that by Proposition 5.1 and Remark 5.2, the distance problem reduces
to estimating the Lévy concentration function for the self-normalized quadratic
forms:

$\mathcal L\Big( \frac{\langle A^{-1}X, X\rangle}{\sqrt{1 + \|A^{-1}X\|_2^2}},\ \varepsilon \Big)$. (8.1)

Here and throughout this section, $A$ denotes the $n \times n$ symmetric random matrix
satisfying assumptions (2.5). $X$ denotes a random vector whose entries are
independent of $A$ and of each other, identically distributed, and satisfy the same
moment assumptions (2.5) as those of $A$, namely they have zero mean, unit
variance, and fourth moment bounded by $M_4^4$.

The goal of this section is to prove the following estimate. 

Theorem 8.1 (Small ball probabilities for quadratic forms). Let $A$ be an $n \times n$
random matrix which satisfies (A), and let $X$ be a random vector in $\mathbb R^n$ whose
entries are independent of each other and of $A$, identically distributed, and
satisfy the same moment assumptions (2.5) as those of $A$, namely they have zero
mean, unit variance, and fourth moment bounded by $M_4^4$. There exist constants
$C_{8.1}, c_{8.1} > 0$ that depend only on the parameters $K$ and $M_4$ from the assumptions
(2.3), (2.5), and such that the following holds. For every $\varepsilon \ge 0$ and every $u \in \mathbb R$,
one has

$\mathbb P\Big\{ \frac{|\langle A^{-1}X, X\rangle - u|}{\sqrt{1 + \|A^{-1}X\|_2^2}} \le \varepsilon \wedge \mathcal E_K \Big\} \le C_{8.1}\varepsilon^{1/9} + 2\exp(-n^{c_{8.1}})$. (8.2)

In particular, we have a desired bound for the Lévy concentration function in
(8.1), namely $C_{8.1}\varepsilon^{1/9} + 2\exp(-n^{c_{8.1}}) + \mathbb P(\mathcal E_K^c)$.

To prove Theorem 8.1, we will first decouple the numerator $\langle A^{-1}X, X\rangle$ from
the denominator $\sqrt{1 + \|A^{-1}X\|_2^2}$ by showing that $\|A^{-1}X\|_2 \sim \|A^{-1}\|_{HS}$ with
high probability. This is done in Section 8.1. Then we decouple the quadratic
form $\langle A^{-1}X, X\rangle$. An ideal decoupling argument would replace $\langle A^{-1}X, X\rangle$ by
$\langle A^{-1}X, X'\rangle$ where $X'$ is an independent random vector; our argument will be of
similar nature. Then by conditioning on $X'$ we obtain a linear form, and we can
estimate its small ball probabilities using the Littlewood-Offord theory (specifically,
using Proposition 6.9 and Structure Theorem 7.1). This will be done in
Section 8.3.

8.1 Size of $A^{-1}X$

The following result compares the size of the denominator $\sqrt{1 + \|A^{-1}X\|_2^2}$
appearing in (8.2) to $\|A^{-1}\|_{HS}$.

Proposition 8.2 (Size of $A^{-1}X$). Let $A$ be a random matrix which satisfies
(A) . There exist constants c, Cfegi, > that depend only on the parameters K 
and M4 from the assumptions (12. 3p . (|2.5p . and such that the following holds. Let 
n^^ < A < c. The random matrix A has the following property with probability 
at least 1 — e"*^". If £k holds, then for every e > one has: 

(i) with probability at least 1 — e~'ElF in X , we have 

\\A-^X\\2 > Cfeu 
(ii) with probability at least $1 - \varepsilon$ in $X$, we have

$\|A^{-1}X\|_2 \le \varepsilon^{-1/2} \|A^{-1}\|_{HS}$;

(Hi) with probability at least 1 — Cfeep/VA — n~'^^'^ in X, we have 

$\|A^{-1}X\|_2 \ge \varepsilon \|A^{-1}\|_{HS}$.

The proof of this result uses the following elementary lemma. 

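Lemma 8.3 (Sums of dependent random variables). Let $Z_1, \ldots, Z_n$ be arbitrary
non-negative random variables (not necessarily independent), and $p_1, \ldots, p_n$ be
non-negative numbers such that

$\sum_{k=1}^n p_k = 1$.

Then for every $\varepsilon \ge 0$ one has

$\mathbb P\Big\{ \sum_{k=1}^n p_k Z_k \le \varepsilon \Big\} \le 2 \sum_{k=1}^n p_k\, \mathbb P\{Z_k \le 2\varepsilon\}$.

Proof. By Markov's inequality, the event $\sum_{k=1}^n p_k Z_k \le \varepsilon$ implies $\sum_k p_k 1_{\{Z_k > 2\varepsilon\}} \le
1/2$ and, consequently, $\sum_k p_k 1_{\{Z_k \le 2\varepsilon\}} \ge 1/2$. Therefore,

$\mathbb P\Big\{ \sum_{k=1}^n p_k Z_k \le \varepsilon \Big\} \le \mathbb P\Big\{ \sum_{k=1}^n p_k 1_{\{Z_k \le 2\varepsilon\}} \ge 1/2 \Big\}$
$\le 2\, \mathbb E \sum_{k=1}^n p_k 1_{\{Z_k \le 2\varepsilon\}}$ (again by Markov's inequality)
$= 2 \sum_{k=1}^n p_k\, \mathbb P\{Z_k \le 2\varepsilon\}$.

The proof is complete. □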
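As a sanity check, the inequality of Lemma 8.3 can be verified exactly on a small finite probability space. The sketch below is illustrative only: the six equally likely outcomes, the three dependent variables and the weights are arbitrary choices made for this example, and the helper name `lemma_sides` is hypothetical.

```python
from fractions import Fraction


def lemma_sides(outcomes, weights, eps):
    """Return (lhs, rhs) of Lemma 8.3 for equally likely outcomes.

    outcomes[w] is the tuple (Z_1(w), ..., Z_n(w)); weights are p_1..p_n.
    """
    total = len(outcomes)
    hits = sum(
        1 for z in outcomes
        if sum(p * zk for p, zk in zip(weights, z)) <= eps
    )
    lhs = Fraction(hits, total)
    rhs = 2 * sum(
        p * Fraction(sum(1 for z in outcomes if z[k] <= 2 * eps), total)
        for k, p in enumerate(weights)
    )
    return lhs, rhs


# Three non-negative, strongly dependent variables driven by one outcome w.
outcomes = [(w, (w * w) % 5, 5 - w) for w in range(6)]
weights = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
eps_values = [Fraction(1, 2), Fraction(1), Fraction(2), Fraction(3)]
results = [lemma_sides(outcomes, weights, eps) for eps in eps_values]
```

No independence is used anywhere, which is the point of the lemma: only the weights summing to one and the non-negativity of the variables matter.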

Proof of Proposition 8.2. Let $e_1, \ldots, e_n$ denote the canonical basis of $\mathbb R^n$, and
let

$x_k := \frac{A^{-1}e_k}{\|A^{-1}e_k\|_2}$, $k = 1, \ldots, n$.

Let us apply Structure Theorem 7.1 combined with the union bound over $k =
1, \ldots, n$. We do this with $L = L_0$, a suitably large constant depending on parameters
$K$ and $M_4$ only (chosen so that Proposition 6.9 can be applied below).
We see that the random matrix $A$ has the following property with probability at
least $1 - n \cdot 2e^{-cn} \ge 1 - 2e^{-cn/2}$: if $\mathcal E_K$ holds then



$x_k \in \mathrm{Incomp}(c_0, c_1)$, $\widehat D_L(x_k, \lambda) > L^{-2} n^{c_{7.1}/\lambda}$, $k = 1, \ldots, n$. (8.3)

Let us fix a realization of $A$ with this property. We shall deduce properties (i),
(ii), (iii) from it. Without loss of generality we may assume that $\mathcal E_K$ holds.

(i) We have

$\|X\|_2 \le \|A\|\, \|A^{-1}X\|_2$.

By $\mathcal E_K$, we have $\|A\| \le 3K\sqrt{n}$. Moreover, Lemma 3.2 and Tensorization Lemma 3.7
imply that the random vector $X$ satisfies $\|X\|_2 \ge c'\sqrt{n}$ with probability at least
$1 - e^{-c'n}$, for some constant $c' = c'(K, M_4) > 0$. It follows that $\|A^{-1}X\|_2 \ge c'/3K$
with the same probability, so part (i) of the proposition is proved.

(ii) Using that $A$ is a symmetric matrix, we express

$\|A^{-1}X\|_2^2 = \sum_{k=1}^n \langle A^{-1}X, e_k\rangle^2 = \sum_{k=1}^n \langle A^{-1}e_k, X\rangle^2 = \sum_{k=1}^n \|A^{-1}e_k\|_2^2\, \langle x_k, X\rangle^2$. (8.4)

Recall that the coordinates of $X$ are independent random variables with zero
mean and unit variance. Therefore $\mathbb E_X \langle x_k, X\rangle^2 = 1$ for all $k$, so

$\mathbb E_X \|A^{-1}X\|_2^2 = \sum_{k=1}^n \|A^{-1}e_k\|_2^2 = \|A^{-1}\|_{HS}^2$.

An application of Markov's inequality yields part (ii) of the proposition.
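The second-moment identity behind part (ii) can be checked exactly for random signs. In the sketch below the $3 \times 3$ symmetric integer matrix `G` stands in for $A^{-1}$ and $X$ is uniform on $\{-1, 1\}^3$; both are assumptions made for illustration, not objects from the paper.

```python
from fractions import Fraction
from itertools import product

# A small symmetric matrix playing the role of A^{-1} (illustrative choice).
G = [[2, -1, 0],
     [-1, 3, 1],
     [0, 1, -2]]
n = len(G)


def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]


def sq_norm(v):
    return sum(t * t for t in v)


# X uniform on {-1, 1}^n: independent, mean zero, unit variance coordinates.
signs = list(product([-1, 1], repeat=n))
mean_sq_norm = Fraction(sum(sq_norm(matvec(G, x)) for x in signs), len(signs))
hs_sq = sum(G[i][j] ** 2 for i in range(n) for j in range(n))

# Markov's inequality: P(|GX|^2 > hs_sq / eps) <= eps.
eps = Fraction(1, 2)
tail = Fraction(
    sum(1 for x in signs if sq_norm(matvec(G, x)) > hs_sq / eps), len(signs)
)
```

Here `mean_sq_norm` equals the squared Hilbert-Schmidt norm `hs_sq`, and `tail` is bounded by `eps`, mirroring the Markov step in the proof.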

(iii) Fix $k \le n$. Then $\langle x_k, X\rangle$ can be expressed as a sum of independent
random variables $\sum_{i=1}^n x_{ki} X_i$, where $x_{ki}$ and $X_i$ denote the coordinates of $x_k$
and of $X$ respectively. This sum can be estimated using Proposition 6.9 (with
$J = [n]$) combined with the estimate (8.3) on the regularized LCD of $x_k$. This
gives

Fx{\{xk,X)\<V2e} <q^(^^ + L^n-'^^y (8.5) 

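Now we combine these estimates for all $k$ using (8.4) and Lemma 8.3 with $p_k =
\|A^{-1}e_k\|_2^2 / \|A^{-1}\|_{HS}^2$; note that $\sum_{k=1}^n p_k = 1$. We obtain

$\mathbb P_X\{\|A^{-1}X\|_2 \le \varepsilon \|A^{-1}\|_{HS}\} = \mathbb P\Big\{ \sum_{k=1}^n p_k \langle x_k, X\rangle^2 \le \varepsilon^2 \Big\}$
$\le 2 \sum_{k=1}^n p_k\, \mathbb P\{\langle x_k, X\rangle^2 \le 2\varepsilon^2\}$ (by Lemma 8.3)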




This proves part (iii), and completes the proof of Proposition 8.2. □

8.2 Decoupling quadratic forms

Decoupling the quadratic form $\langle A^{-1}X, X\rangle$ is based on the following general result.
Similar decoupling techniques for quadratic forms were first applied by Götze [6]
and used in literature many times since then; in particular such a decoupling
argument was used in [H |3] in a context similar to ours. 

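Lemma 8.4 (Decoupling quadratic forms). Let $G$ be an arbitrary symmetric
$n \times n$ matrix, and let $X$ be a random vector in $\mathbb R^n$ with independent coordinates.
Let $X'$ denote an independent copy of $X$. Consider a subset $J \subseteq [n]$. Then for
every $\varepsilon \ge 0$ one has

$\mathcal L(\langle GX, X\rangle, \varepsilon)^2 = \sup_{u \in \mathbb R} \mathbb P\{|\langle GX, X\rangle - u| \le \varepsilon\}^2$
$\le \mathbb P_{X,X'}\{|\langle G(P_{J^c}(X - X')), P_J X\rangle - v| \le \varepsilon\}$

where $v$ is some random variable whose value is determined by the $J^c \times J^c$ minor
of $G$ and the random vectors $P_{J^c}X$, $P_{J^c}X'$.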

The point of this result is that, upon conditioning on the coordinates of $X$
and $X'$ in $J^c$, the vector $x_0 := G(P_{J^c}(X - X'))$ and the number $v$ become fixed. So the Lévy
concentration function of the quadratic form $\langle GX, X\rangle$ gets bounded by the Lévy
concentration function of the linear form $\langle x_0, P_J X\rangle$. The latter, as we know, can
be estimated using the Littlewood-Offord theory.

The proof of Lemma 8.4 is based on the general decoupling lemma from [16],
which was already used for a purpose similar to ours in [3].

Lemma 8.5. Let $Y$ and $Z$ be independent random variables or vectors, and let
$Z'$ be an independent copy of $Z$. Let $\mathcal E(Y, Z)$ be an event which is determined by
the values of $Y$ and $Z$. Then

$\mathbb P\{\mathcal E(Y, Z)\}^2 \le \mathbb P\{\mathcal E(Y, Z) \cap \mathcal E(Y, Z')\}$. □
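The inequality of Lemma 8.5 is a consequence of the Cauchy-Schwarz (or Jensen) inequality applied after conditioning on $Y$, and it can be confirmed by exact enumeration on a toy example. In the sketch below, the uniform distributions on `Y_VALUES` and `Z_VALUES` and the particular `event` are arbitrary illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

Y_VALUES = range(3)   # Y uniform on {0, 1, 2}
Z_VALUES = range(4)   # Z and its independent copy Z' uniform on {0, 1, 2, 3}


def event(y, z):
    # An arbitrary event depending on both coordinates (illustrative choice).
    return (y * z + z) % 3 == 0


pairs = list(product(Y_VALUES, Z_VALUES))
triples = list(product(Y_VALUES, Z_VALUES, Z_VALUES))

# P{E(Y, Z)} and P{E(Y, Z) and E(Y, Z')}, computed exactly.
p_single = Fraction(sum(event(y, z) for y, z in pairs), len(pairs))
p_double = Fraction(
    sum(event(y, z) and event(y, z2) for y, z, z2 in triples), len(triples)
)
```

For this example `p_single` is 2/3 and `p_double` is 1/2, so the squared probability 4/9 is indeed at most 1/2.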

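Proof of Decoupling Lemma 8.4. By permuting the coordinates, without loss of
generality we can assume that $J$ and $J^c$ are intervals of coordinates with $\sup J <
\inf J^c$. The decomposition $[n] = J \cup J^c$ induces the decomposition of the matrix
$G$ and all the vectors in question,

$G = \begin{pmatrix} E & F \\ F^{\mathsf T} & H \end{pmatrix}$, $X = \begin{pmatrix} Y \\ Z \end{pmatrix}$, $X' = \begin{pmatrix} Y' \\ Z' \end{pmatrix}$, $\tilde X = \begin{pmatrix} Y \\ Z' \end{pmatrix}$.

Here $E$ is a $J \times J$ minor of $G$, $F$ is a $J \times J^c$ minor, etc., and similarly $Y \in \mathbb R^J$,
$Z \in \mathbb R^{J^c}$, etc. Let us fix a $u \in \mathbb R$ and apply Lemma 8.5; this gives

$p^2 := \mathbb P\{|\langle GX, X\rangle - u| \le \varepsilon\}^2 \le \mathbb P_{X,X'}\{|\langle GX, X\rangle - u| \le \varepsilon \wedge |\langle G\tilde X, \tilde X\rangle - u| \le \varepsilon\}$. (8.6)

By the triangle inequality,

$p^2 \le \mathbb P_{X,X'}\{|\langle GX, X\rangle - \langle G\tilde X, \tilde X\rangle| \le 2\varepsilon\}$.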
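By our decomposition, we have

$\langle GX, X\rangle = \langle EY, Y\rangle + 2\langle FZ, Y\rangle + \langle HZ, Z\rangle$,
$\langle G\tilde X, \tilde X\rangle = \langle EY, Y\rangle + 2\langle FZ', Y\rangle + \langle HZ', Z'\rangle$.

Hence

$\langle GX, X\rangle - \langle G\tilde X, \tilde X\rangle = 2\langle F(Z - Z'), Y\rangle + \langle HZ, Z\rangle - \langle HZ', Z'\rangle$.

Recall that $F$ is the restriction of the matrix $G$ onto the pairs of coordinates in
$J \times J^c$, that $Z - Z'$ is the restriction of the vector $X - X'$ onto the coordinates
in $J^c$, and that $Y$ is the restriction of $X$ onto the coordinates in $J$. So

$\langle F(Z - Z'), Y\rangle = \langle G(P_{J^c}(X - X')), P_J X\rangle$.

Similarly we can see that the value of $\langle HZ, Z\rangle - \langle HZ', Z'\rangle$ depends on the $J^c \times J^c$
minor $H$ and on the restrictions of $X$ and $X'$ onto the coordinates in $J^c$. So
setting $v = \langle HZ, Z\rangle - \langle HZ', Z'\rangle$, we express

$\langle GX, X\rangle - \langle G\tilde X, \tilde X\rangle = 2\langle G(P_{J^c}(X - X')), P_J X\rangle + v$.

This and (8.6) complete the proof of Decoupling Lemma 8.4, after dividing the
inequality inside the probability by 2 and renaming $-v/2$ as $v$. □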
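The algebraic identity at the heart of this proof holds for every symmetric matrix and every pair of vectors, so it can be tested in exact integer arithmetic. The sketch below is a check under illustrative assumptions: the helper names, the dimension, the index set `J` and the random integer inputs are all hypothetical, and the hybrid vector `X_tilde` takes its coordinates from $X$ on $J$ and from $X'$ on $J^c$.

```python
import random


def matvec(G, x):
    return [sum(g * xj for g, xj in zip(row, x)) for row in G]


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def restrict(x, idx):
    # Coordinate projection: keep coordinates in idx, zero out the rest.
    return [xi if i in idx else 0 for i, xi in enumerate(x)]


def decoupling_gap(G, X, X_prime, J):
    """Difference between the two sides of the decoupling identity."""
    n = len(X)
    J = set(J)
    Jc = set(range(n)) - J
    X_tilde = [X[i] if i in J else X_prime[i] for i in range(n)]
    lhs = inner(matvec(G, X), X) - inner(matvec(G, X_tilde), X_tilde)
    diff = [a - b for a, b in zip(X, X_prime)]
    linear = inner(matvec(G, restrict(diff, Jc)), restrict(X, J))
    X_Jc, Xp_Jc = restrict(X, Jc), restrict(X_prime, Jc)
    v = inner(matvec(G, X_Jc), X_Jc) - inner(matvec(G, Xp_Jc), Xp_Jc)
    return lhs - (2 * linear + v)


rng = random.Random(0)
n, J = 5, [0, 2]
gaps = []
for _ in range(200):
    upper = [[rng.randint(-3, 3) for _ in range(n)] for _ in range(n)]
    # Symmetrize by reading only the upper triangle.
    G = [[upper[min(i, j)][max(i, j)] for j in range(n)] for i in range(n)]
    X = [rng.choice([-1, 1]) for _ in range(n)]
    X_prime = [rng.choice([-1, 1]) for _ in range(n)]
    gaps.append(decoupling_gap(G, X, X_prime, J))
```

Every entry of `gaps` is zero. Note that the check does not require $J$ to be an interval; the interval assumption in the proof is only a notational convenience.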
8.3 Proof of Theorem 8.1

Our argument will be based on decoupling the quadratic form $\langle A^{-1}X, X\rangle$, and
treating the resulting linear form using the Littlewood-Offord theory developed
earlier in this paper.

Step 1: Constructing a random subset $J$ and assignment $\mathrm{spread}(x)$.
The decoupling starts by decomposing $[n]$ into two random sets $J$ and $J^c$. To
this end, we consider independent $\{0, 1\}$-valued random variables $\delta_1, \ldots, \delta_n$
("selectors") with $\mathbb E\delta_i = c_{00}/2$. (Recall that the constant $c_{00}$, which depends on $K$
and $M_4$ only, was fixed in the definition of the regularized LCD in Section 6.2.)
We then define

$J := \{i \in [n] : \delta_i = 0\}$.

Then $\mathbb E|J^c| = c_{00} n/2$. By a basic result in large deviations (see e.g. [1] Theorem
A.1.4), the bound

$|J^c| \le c_{00} n$ (8.7)

holds with high probability:

$\mathbb P\{\text{(8.7) holds}\} \ge 1 - 2e^{-c_{00}' n}$

where $c_{00}' = c_{00}^2/2$.
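A large deviation bound of this type can be compared with the exact binomial tail. The sketch below uses Hoeffding's inequality, $\mathbb P\{S - \mathbb E S \ge t\} \le \exp(-2t^2/n)$ for a sum $S$ of $n$ independent $\{0,1\}$-valued variables, which is a standard substitute for the cited theorem and not necessarily the exact statement the author invokes. The values of `n` and `c00` are illustrative assumptions and not the paper's constants.

```python
import math
from fractions import Fraction


def binomial_upper_tail(n, q, threshold):
    """Exact P(S > threshold) for S ~ Binomial(n, q), with q a Fraction."""
    return sum(
        math.comb(n, k) * q ** k * (1 - q) ** (n - k)
        for k in range(threshold + 1, n + 1)
    )


n = 200
c00 = Fraction(1, 10)        # illustrative value only
q = c00 / 2                  # P(delta_i = 1), so the mean of |J^c| is c00*n/2
threshold = int(c00 * n)     # the complement of the event |J^c| <= c00 * n
exact_tail = binomial_upper_tail(n, q, threshold)

# Hoeffding with t = c00*n/2 gives exp(-2 t^2 / n) = exp(-c00^2 * n / 2).
hoeffding = math.exp(-float(c00) ** 2 * n / 2)
```

The exact tail is far below the Hoeffding bound for these values; the bound is what matters in the proof because it is uniform and decays exponentially in $n$.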

Consider a fixed realization of $J$ that satisfies (8.7). As we know from Section
6.2, at least $2c_{00}n$ coordinates of a vector $x \in \mathrm{Incomp}(c_0, c_1)$ satisfy the
regularity condition (6.4). It follows that for each vector $x \in \mathrm{Incomp}(c_0, c_1)$ we
can assign a subset

$\mathrm{spread}(x) \subseteq J$, $|\mathrm{spread}(x)| = \lceil c_{00} n \rceil$ (8.8)

so that the regularity condition (6.4) holds for all $k \in \mathrm{spread}(x)$. If there
is more than one way to assign $\mathrm{spread}(x)$ to $x$, we choose one fixed way to do
so. This results in a valid assignment (per Section 6.2) that depends only on
the choice of the random set $J$. We shall use this assignment in applications of
Definition 6.6 of the regularized LCD of $x$.

Step 2: Estimating the denominator $\sqrt{1 + \|A^{-1}X\|_2^2}$ and LCD of the inverse.
Proposition 8.2 will allow us to replace in (8.2) the denominator $\sqrt{1 + \|A^{-1}X\|_2^2}$
by $\|A^{-1}\|_{HS}$. However, we have to do this carefully in order to withstand losses
that will occur at the decoupling step. So, let $\varepsilon_0 \in (0, 1)$ and let $X'$ denote an
independent copy of the random vector $X$. We consider the following event that
is determined by the random matrix $A$, random vectors $X$, $X'$ and the random
set $J$:

$\varepsilon_0 \sqrt{1 + \|A^{-1}X\|_2^2} \le \|A^{-1}\|_{HS} \le \frac{1}{\varepsilon_0} \|A^{-1}(P_{J^c}(X - X'))\|_2$. (8.9)

Recall that the coordinates $X_i$ of $X$ are independent random variables with zero
mean, unit variance, and $\mathbb E X_i^4 \le M_4^4$. It follows that the coordinates $Y_i =
\delta_i(X_i - X_i')$ of the random vector $Y := P_{J^c}(X - X')$ are again independent
random variables with

$\mathbb E Y_i = 0$, $\mathbb E Y_i^2 = c_{00}$, $\mathbb E Y_i^4 \le 8c_{00} M_4^4$.
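For random signs these three moments can be computed exactly by enumerating the selector and the two signs. The sketch below assumes $X_i$ and $X_i'$ are independent and uniform on $\{-1, 1\}$ (so the fourth moment of $X_i$ equals 1) and uses an illustrative value of `c00`; neither assumption comes from the paper.

```python
from fractions import Fraction
from itertools import product

c00 = Fraction(1, 10)   # illustrative value only
q = c00 / 2             # P(delta = 1)
FOURTH_MOMENT_X = 1     # E X^4 for a random sign


def moment(power):
    """Exact E[(delta * (X - X'))^power] for a selector and two random signs."""
    total = Fraction(0)
    for delta, x, x_prime in product([0, 1], [-1, 1], [-1, 1]):
        prob = (q if delta == 1 else 1 - q) * Fraction(1, 4)
        total += prob * (delta * (x - x_prime)) ** power
    return total


mean, second, fourth = moment(1), moment(2), moment(4)
```

In this example the fourth moment equals `4 * c00`, comfortably within the general bound `8 * c00 * FOURTH_MOMENT_X`.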

We see that Proposition 8.2 applies for $X$, and also for $X$ replaced by $c_{00}^{-1/2} Y$
(in the latter case with $M_4$ replaced by $2c_{00}^{-1/4} M_4$). It follows that

rA,x,x',jm holds v£i.}>l-^- n-^'/A _ 2e--'" 

where $C', c' > 0$ depend only on $K$ and $M_4$.
Consider the random vector

$x_0 := \frac{A^{-1}(P_{J^c}(X - X'))}{\|A^{-1}(P_{J^c}(X - X'))\|_2}$. (8.10)

(If the denominator equals zero, assign to $x_0$ an arbitrary fixed vector in $S^{n-1}$.)
Let us condition on an arbitrary realization of random vectors X, X' and on a 
realization of J which satisfies (|8.7p . Fix some value of the parameter A satisfying 
n~'^Lli < A < Coo/ 3 as required in Structure Theorem 17. 11 and consider the event 



Xq 



G Incomp(co,ci) and Dlo{xo, X) > C"n''" (8.11) 



which depends on the random matrix $A$. By Structure Theorem 7.1, the conditional
probability is

$\mathbb P_A\{\text{(8.11) holds} \vee \mathcal E_K^c \mid X, X', J \text{ satisfies (8.7)}\} \ge 1 - 2e^{-c''n}$.

Here $L_0, C'', c'' > 0$ depend only on $K$ and $M_4$.
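Combining the three probabilities, we obtain

$\mathbb P_{A,X,X',J}\{(\text{(8.7), (8.9), (8.11) hold}) \vee \mathcal E_K^c\} \ge 1 - p_0$, (8.12)

where $p_0$ denotes the sum of the failure probabilities in the three bounds above.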
It follows that there exists a realization of $J$ that satisfies (8.7) and such that

$\mathbb P_{A,X,X'}\{(\text{(8.9), (8.11) hold}) \vee \mathcal E_K^c\} \ge 1 - p_0$.

Let us fix such a realization of $J$ for the rest of the proof. An application of
Fubini's theorem shows that the random matrix $A$ has the following property
with probability at least $1 - \sqrt{p_0}$:

$\mathbb P_{X,X'}\{(\text{(8.9), (8.11) hold}) \vee \mathcal E_K^c \mid A\} \ge 1 - \sqrt{p_0}$.

But the event $\mathcal E_K$ depends on $A$ only and not on $X$ or $X'$. Therefore, the random
matrix $A$ has the following property with probability at least $1 - \sqrt{p_0}$. Either
$\mathcal E_K^c$ holds, or:

$\mathcal E_K$ holds and $\mathbb P_{X,X'}\{\text{(8.9), (8.11) hold} \mid A\} \ge 1 - \sqrt{p_0}$. (8.13)

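Step 3: decoupling. The event we are interested in is

$\mathcal E := \Big\{ \frac{|\langle A^{-1}X, X\rangle - u|}{\sqrt{1 + \|A^{-1}X\|_2^2}} \le \varepsilon \Big\}$.

We need to estimate the probability

$\mathbb P_{A,X}(\mathcal E \cap \mathcal E_K) \le \mathbb P_{A,X}\{\mathcal E \wedge \text{(8.13) holds}\} + \mathbb P_{A,X}\{\mathcal E_K \wedge \text{(8.13) fails}\}$.

By the previous step in the proof, the second term here is bounded by $\sqrt{p_0}$.
Therefore

$\mathbb P_{A,X}(\mathcal E \cap \mathcal E_K) \le \sup_{A \text{ satisfies (8.13)}} \mathbb P_X(\mathcal E \mid A) + \sqrt{p_0}$.

Computing the same probability in the larger space determined by the random
vectors $X$, $X'$, and using property (8.13), we write

$\mathbb P_{A,X}(\mathcal E \cap \mathcal E_K) \le \sup_{A \text{ satisfies (8.13)}} \mathbb P_{X,X'}\{\mathcal E \wedge \text{(8.9) holds} \mid A\} + 2\sqrt{p_0}$. (8.14)

Let us fix a realization of a random matrix $A$ satisfying (8.13) for the rest of the
proof. So our goal is to bound the probability

$p_1 := \mathbb P_{X,X'}\{\mathcal E \wedge \text{(8.9) holds}\}$.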
Using the definition of $\mathcal E$ and the first inequality in property (8.9), we have

$p_1 \le \mathbb P_{X,X'}\Big\{ |\langle A^{-1}X, X\rangle - u| \le \frac{\varepsilon}{\varepsilon_0} \|A^{-1}\|_{HS} \Big\}$.

We apply Decoupling Lemma 8.4 and obtain

$p_1^2 \le \mathbb P_{X,X'}\{\mathcal E_0\}$

where

$\mathcal E_0 = \Big\{ |\langle A^{-1}(P_{J^c}(X - X')), P_J X\rangle - v| \le \frac{\varepsilon}{\varepsilon_0} \|A^{-1}\|_{HS} \Big\}$

and where $v = v(A^{-1}, P_{J^c}X, P_{J^c}X')$ denotes a number that depends on $A^{-1}$,
$P_{J^c}X$, $P_{J^c}X'$ only. Further, using property (8.13) (in which conditioning on $A$
is no longer needed as we are treating $A$ as a fixed matrix), we get

$p_1^2 \le \mathbb P_{X,X'}\{\mathcal E_0\} \le \mathbb P_{X,X'}\{\mathcal E_0 \wedge \text{(8.9), (8.11) hold}\} + \sqrt{p_0}$.

Let us divide both sides in the inequality defining the event $\mathcal E_0$ by $\|A^{-1}(P_{J^c}(X -
X'))\|_2$. Using definition (8.10) of $x_0$ and the second inequality in (8.9), we obtain

$p_1^2 \le \mathbb P_{X,X'}\{|\langle x_0, P_J X\rangle - w| \le \varepsilon_0^{-2}\varepsilon \wedge \text{(8.11) holds}\} + \sqrt{p_0}$ (8.15)

where $w = w(A^{-1}, P_{J^c}X, P_{J^c}X')$ is an appropriate number.

Step 4: The small ball probabilities of a linear form. By definition, the
random vector $x_0$ is determined by the random vector $P_{J^c}(X - X')$, which is
independent of the random vector $P_J X$. So if we fix an arbitrary realization of
the random vectors $P_{J^c}X$ and $P_{J^c}X'$, this will fix the vector $x_0$ and the number
$w$ in (8.15). Since moreover (8.11) is a property of $x_0$, we conclude that

$p_1^2 \le \sup_{x_0 \text{ satisfies (8.11)}} \mathbb P_{P_J X}\{|\langle x_0, P_J X\rangle - w| \le \varepsilon_0^{-2}\varepsilon\} + \sqrt{p_0}$.

So let us fix a vector $x_0 = (x_{01}, \ldots, x_{0n}) \in S^{n-1}$ that satisfies (8.11) and a number
$w \in \mathbb R$. We have reduced the problem to estimating the small ball probabilities
for the sum of independent random variables

$\langle x_0, P_J X\rangle = \sum_{k \in J} x_{0k} \xi_k$

where we denote $X = (\xi_1, \ldots, \xi_n)$.

We can apply Proposition 6.9 for this sum, noting that by (8.8) we have
$J \supseteq \mathrm{spread}(x_0) \supseteq I(x_0)$ as required there. (The last inclusion follows by the
definition of the maximizing set $I(x_0)$, recall Definition 6.6.) It follows that

Pp,.{|(xo,P.X) - .| < e~^'\} < + 

^ VA Dlo{xq,\) 

for some $C_1 = C_1(K, M_4)$. Using property (8.11) to bound the second term in
the right hand side, we obtain

n -3/2 

Pi S h + ^po, 

Now we estimate the probability of the desired event in (8.14) as

^A,x{£n£K)<Pi + 2^ 

Cl£o^^^g \l/2 , , ./^.l/2 1/4 



-yvr") +{c[n-^^T +pr+^v^o. 



Recalling the definition (8.12) of $p_0$ and simplifying, we obtain

Step 5: Optimizing the parameters. This inequality holds for all $\varepsilon_0 > 0$, so

we can optimize in eq- Setting eq = e^/'^ /\^/^ , we obtain after some simplification 
that 

¥aA£ n £k) < + C^n-^i/A ^ c[e~<^. 

By assumption, A > n~'Eil where qfj\ > is a small constant. So, for appropri- 
ately chosen constants, the term n~'^i/'^ dominates the term e~'^i". We obtain 

Pa,x{£ n £k) < + 2C[n-</\ 

Recall that this inequality holds for all e > and n~'^^ < A < Coo/3, so we can 
also optimize in A. For convenience, we isolate this step as a separate elementary 
observation. 

Fact 8.6 (Optimization). Let $C \ge 1$, $a, b, c', c > 0$. There exist numbers $C_0$ and
$n_0$ that depend only on $a, b, c', C, c$ and such that the following holds. Let $n \ge n_0$.
Consider a function $p(\varepsilon) : [0, 1] \to \mathbb R_+$ which satisfies

$p(\varepsilon) \le M^a \varepsilon^b + n^{-c'M}$ for all $\varepsilon \in [0, 1]$ and $C \le M \le n^c$.

Then

$p(\varepsilon) \le C_0 \varepsilon^{b - 0.01} + n^{-c'n^c}$ for all $\varepsilon \in [0, 1]$.

Proof of Fact 8.6. Choose some number $C \le M_0 \le n^c$ whose value will be
determined later. By the assumption, the inequality

$p \le M_0^a \varepsilon^b + n^{-c'M_0} \le (M_0^a + 1)\varepsilon^b$ (8.16)

holds for all $\varepsilon \ge n^{-c'M_0/b}$. On the other hand, using the assumption with $M = n^c$,
we see that the inequality

$p \le n^{ac} \varepsilon^b + n^{-c'n^c} \le \varepsilon^{b - 0.01} + n^{-c'n^c}$

holds for all $\varepsilon \le n^{-100ac}$. Let us choose $M_0$ as the minimal number such that
$M_0 \ge C$ and $c'M_0/b \ge 100ac$. Note that we have $C \le M_0 \le n^c$ as required, for
sufficiently large $n_0$. Therefore, every $\varepsilon$ belongs to the range where inequality
(8.16) holds or to the range where the second inequality holds, or both. So at
least one of these inequalities holds for all $\varepsilon \in [0, 1]$. This completes the proof
with $C_0 = M_0^a + 1$. □
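The two-regime argument can be illustrated numerically. The sketch below reads the hypothesis as $p(\varepsilon) \le M^a \varepsilon^b + n^{-c'M}$ for $C \le M \le n^c$ and tests the conclusion for the function obtained by taking the better of the two choices $M = M_0$ and $M = n^c$ used in the proof. The parameter values (other than $a = 5/32$, $b = 1/8$), the grid of $\varepsilon$ values and the function name `check_fact` are assumptions made for illustration.

```python
def check_fact(a, b, c, c_prime, C, n, exponents):
    """Check the conclusion of the optimization fact on eps = 10^(-j)."""
    # Smallest M0 with M0 >= C and c' * M0 / b >= 100 * a * c.
    M0 = max(C, 100 * a * b * c / c_prime)
    M_top = n ** c
    assert M0 <= M_top, "n is too small for this choice of parameters"
    C0 = M0 ** a + 1

    def bound(M, eps):
        return M ** a * eps ** b + n ** (-c_prime * M)

    ok = True
    for j in exponents:
        eps = 10.0 ** (-j)
        # Any admissible p is at most the smaller of the two bounds.
        p = min(bound(M0, eps), bound(M_top, eps))
        target = C0 * eps ** (b - 0.01) + n ** (-c_prime * M_top)
        ok = ok and p <= target
    return M0, C0, ok


M0, C0, ok = check_fact(
    a=5 / 32, b=1 / 8, c=0.1, c_prime=1.0, C=1.0, n=10 ** 6,
    exponents=range(0, 61),
)
```

For large $\varepsilon$ the choice $M = M_0$ wins, and for very small $\varepsilon$ the choice $M = n^c$ takes over; the loss of $0.01$ in the exponent is what pays for the factor $n^{ac}$ in the second regime.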

Applying Fact 8.6 with $M = 1/\lambda$, $a = 5/32$ and $b = 1/8$, we conclude that

$\mathbb P_{A,X}(\mathcal E \cap \mathcal E_K) \le C_0 \varepsilon^{1/9} + n^{-c'n^c}$ (8.17)

holds for all e G [0, 1], where c = q^j^ Since we can choose Cq > 1, the same 
inequality trivially holds for e > 1 as the right hand side becomes larger than 1. 
The proof of Theorem 8.1 is complete. □

9 Consequences: the distance problem and invertibility of random matrices

9.1 The distance theorem

An application of Theorem 8.1 together with Proposition 5.1 produces a satisfactory
solution to the distance problem posed in the beginning of Section 5.

Corollary 9.1 (Distance between random vectors and subspaces). Let $A$ be a
random matrix satisfying (A). There exist constants $C, c > 0$ that depend only
on the parameters $K$ and $M_4$ from (2.3), (2.5), and such that the following holds.
Let $A_k$ denote the $k$-th column of $A$ and $H_k$ denote the span of the other columns.
For every $\varepsilon \ge 0$, one has

$\mathbb P\{\mathrm{dist}(A_k, H_k) \le \varepsilon \wedge \mathcal E_K\} \le C\varepsilon^{1/9} + 2\exp(-n^c)$.

Proof. By permuting the coordinates, we can assume without loss of generality
that $k = 1$. Proposition 5.1 states that

$\mathrm{dist}(A_1, H_1) = \frac{|\langle B^{-1}X, X\rangle - a_{11}|}{\sqrt{1 + \|B^{-1}X\|_2^2}}$

where $B$ denotes the $(n - 1) \times (n - 1)$ minor of $A$ obtained by removing the first
row and the first column from $A$, and $X \in \mathbb R^{n-1}$ denotes the first column of $A$ with
the first entry removed. By assumptions, $B$ is a random matrix which satisfies
the same assumptions (A) as $A$ (except the dimension is one less), and $X$ is an
independent random vector whose entries also satisfy the same assumptions (2.5).
So we can apply Theorem 8.1 for $B$ and $X$. Conditioning on the independent
entry $a_{11} = u$, we obtain that

$\mathbb P\Big\{ \frac{|\langle B^{-1}X, X\rangle - a_{11}|}{\sqrt{1 + \|B^{-1}X\|_2^2}} \le \varepsilon \wedge \mathcal E_K \Big\} \le C_{8.1}\varepsilon^{1/9} + 2\exp(-(n - 1)^{c_{8.1}})$.

This completes the proof. □



9.2 Invertibility of random matrices: proof of Theorem 1.1

We can now derive the main result of the paper, Theorem 1.1. In Section 2.3,
we reduced the problem to proving the invertibility bound (2.4). We shall now
establish this bound, which immediately implies Theorem 1.1.

Theorem 9.2 (Invertibility of symmetric random matrices). Let $A$ be a random
matrix which satisfies (A). Consider a number $K > 0$. Then, for all $\varepsilon \ge 0$, one
has

$\mathbb P\Big\{ \min_k |\lambda_k(A)| \le \varepsilon n^{-1/2} \wedge \mathcal E_K \Big\} \le C\varepsilon^{1/9} + 2\exp(-n^c)$,

where $C, c > 0$ depend only on the fourth moment bound $M_4$ from (2.5) and on
$K$.

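Proof. Denote by $p$ the probability in question. As we observed in Section 2.3,

$p = \mathbb P\Big\{ \min_{x \in S^{n-1}} \|Ax\|_2 \le \varepsilon n^{-1/2} \wedge \mathcal E_K \Big\}$.

In (3.1), we split the invertibility problem into two, for compressible and
incompressible vectors:

$p \le \mathbb P\Big\{ \inf_{x \in \mathrm{Comp}(c_0, c_1)} \|Ax\|_2 \le \varepsilon n^{-1/2} \wedge \mathcal E_K \Big\}
+ \mathbb P\Big\{ \inf_{x \in \mathrm{Incomp}(c_0, c_1)} \|Ax\|_2 \le \varepsilon n^{-1/2} \wedge \mathcal E_K \Big\}$.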

The values of $c_0, c_1$ were then fixed in Remark 4.3. The probability for the
compressible vectors is bounded by 2e~'Eir- by ()4.1ip . The probability for the 
incompressible vectors is estimated via distances in Lemma 3.9, see Remark 3.10.
This gives



1 

p < 2e-1^ + p| distMfc, Hk) < c7^e A Sr]- 

Finally, the distances are estimated in Corollary 9.1, which gives

p < 2e-'^ + (%tP^^^ + 2exp(-n'EI]). 

Choosing the value of the constant $c > 0$ sufficiently small, we complete the
proof of Theorem 9.2. □